Tarsnap / tarsnap-gui

Cross-platform GUI for the Tarsnap backup service.
https://www.tarsnap.com
BSD 2-Clause "Simplified" License

Improve macOS support #560

Open gperciva opened 10 months ago

gperciva commented 10 months ago

There's a bunch of problems with current macOS. These need to be fixed on git master, and then backported into a new 1.0.3 release.

Likely in a later release:

amarendra commented 10 months ago

I'm on Sonoma, using the app after copying Tarsnap.app from the Homebrew versioned folder.

Trying the register-machine step in the Setup Wizard results in a transaction that stays in progress forever; it went on for 2-3 hours the one time I noted it. By now I have tried it 7-8 times. (The "already in progress" message shows when I click the register button again, but even otherwise it keeps going forever without any error or success.) I do not know how long it is supposed to take. I think it should not take much time, because I have not even started backing up yet.

Network Activity Monitor shows Received Packets changing, but for how long is it supposed to do this? My total backup size (on my old Mac) was a few GB, and I think the stored backup on Tarsnap is 1-2 GB. I don't see any ETA either.

And if it gets interrupted, does it start all over again?

[Screenshot 2023-10-21 at 6.19.59 AM]

gperciva commented 10 months ago

Hi @amarendra, it will take a while, because it's loading metadata about your existing archives (from your old computer).

That said, I would expect it to be 2-10 minutes, not 2-3 hours!

I will investigate for possible causes.

amarendra commented 10 months ago

> That said, I would expect it to be 2-10 minutes, not 2-3 hours!

Even this (current) try has taken a solid ~50 mins

> I will investigate for possible causes.

Please let me know if I can help. Logs etc (and where I can find them).

I can email you my Tarsnap account email address, if that would help you see pings from my machine by any chance.

gperciva commented 10 months ago

Ouch. Yeah, 50 minutes is ridiculous. Sorry. :(

Let's try it on the command line, since that has better information. If you open a terminal, and copy your tarsnap keyfile to your home directory, then try:

tarsnap --keyfile tarsnap.key --cachedir tarsnap-cachedir --fsck

(fsck is the unix filesystem checker utility, so our "check archives" command is named after it)

On my system (with only a small number of archives), I see:

$ tarsnap --keyfile tarsnap.key --cachedir tarsnap-cachedir --fsck
Directory tarsnap-cachedir created for "--cachedir tarsnap-cachedir"
Phase 1: Verifying metadata validity
Phase 2: Verifying metadata/metaindex consistency
Phase 3: Reading chunk list
Phase 4: Verifying archive completeness
  Archive 1/47...
  Archive 2/47...
  Archive 3/47...
-- I'm cutting this for legibility --
  Archive 46/47...
  Archive 47/47...
Phase 5: Identifying unreferenced chunks

and it takes 56 seconds. I'm sure that you have more than 47 archives, but at least you should see the progress as they go by.

I don't personally have access to the tarsnap server logs, but if this --fsck command doesn't show anything interesting, we can ask Colin to check.

amarendra commented 10 months ago

I tried tarsnap --keyfile tarsnap.key --cachedir tarsnap-cachedir --fsck on the CLI, and it was stuck at

Phase 1: Verifying metadata validity

for almost 10 mins. Then the following steps finished:

Phase 2: Verifying metadata/metaindex consistency
Phase 3: Reading chunk list
Phase 4: Verifying archive completeness

It verified 1130/1130 archives. (I might have forgotten but I think I had some sort of pruning set up on my last machine.) This step took some 50-60 mins in total - this is because I have too many archives I believe, right?

Then the whole process terminated a few seconds after archive verification finished, with this printed:

Phase 5: Identifying unreferenced chunks

There was nothing printed after this.

But either way, it didn't result in anything for 2-3 hours from the GUI (or I wouldn't know whether it was trying albeit slowly maybe).

Also, I did see some transfer yesterday in my account. However, the machine name shown is my old machine's name, so I believe that is because the key is mapped to that machine name, right?

gperciva commented 10 months ago

Ok, the --fsck succeeded. That's good news!

As for whether 1130 archives is too many... well, it's not necessarily "too many", but the more archives you have, the more time it will take for operations like --fsck and --list-archives to finish. It's completely up to you how many archives you keep!

Yes, your machine name is mapped to your key.

The time it takes in the GUI should be pretty much exactly the same as it took on the command-line; or maybe 0.5 seconds longer. So the GUI is definitely doing something bad.

Let's avoid the GUI's setup wizard. You have the tarsnap cache already, from the --fsck command.

We need to start by showing your user's Library folder:

  1. open a Finder window
  2. navigate to your home directory (or "user directory")
  3. right-click on the background
  4. Go to "Show View Options"
  5. click on "Show Library Folder".

Now we can do the Tarsnap app stuff:

  1. Launch the Tarsnap app. It should be in the "Tarsnap setup".
  2. Click on "Skip wizard".
  3. the main Tarsnap app window will open, and there will be 4 error messages. Click "ok" on all 4 errors.
  4. Click on "Settings"
  5. Next to "Machine key", click on "change", and find your tarsnap.key file.
  6. click on "Application" at the bottom.
  7. set the Tarsnap client directory to /opt/homebrew/bin
  8. the Tarsnap cache directory can be whatever you want, but the default is ~/Library/Caches/Tarsnap Backup Inc./Tarsnap
  9. Go back to the command-line, and do: cp -a ~/tarsnap-cachedir/* "$HOME/Library/Caches/Tarsnap Backup Inc./Tarsnap/" (the destination uses $HOME because ~ is not expanded inside double quotes). This will get the results of your --fsck command into the GUI, so that we don't need to do it again.
  10. Back to the GUI, and set up the App data directory; it can be whatever you want, but the default is ~/Library/Application Support/Tarsnap Backup Inc./Tarsnap
  11. close and re-open the Tarsnap app. It should now go through the "Updating archives list from remote..."
  12. at a rough guess, the "Updating archives list" should be 10 times faster than the --fsck step.

I apologize for all this trouble.

amarendra commented 10 months ago

It crashed on step 11 when I closed the app. Sharing log tarsnap.crash.log. (Note: I don't recall when was the last time tarsnap-gui crashed on me, at least I didn't notice it - but then I was using it on Catalina so far).

But when I restarted, step 12 went as expected.

Updating archives list from remote took 12 mins.

But "Updating archives list from remote" happened again when I started Tarsnap, and it took just as long. Even if I close the app after the archive list update has finished once and merely restart it, it takes that much time. Is this expected, due to my number of archives? Is there any way to improve this (other than reducing the number of archives, which I will do)?

Here's the exact time:
[23/10/23 10:17 AM] ==Session start==
[23/10/23 10:17 AM] Updating archives list from remote...
[23/10/23 10:29 AM] Updating archives list from remote... done.
[23/10/23 10:35 AM] ==Session end==
[23/10/23 10:35 AM] ==Session start==
[23/10/23 10:35 AM] Updating archives list from remote…
[23/10/23 10:47 AM] Updating archives list from remote... done.

When I tried to trigger a backup for a job, I got this warning: "Some backup paths for Job <job name> are not accessible anymore and thus backup may be incomplete. Proceed with backup?" I stopped and didn't attempt a backup, because I have given Full Disk Access to Tarsnap.app.

My source list is completely inside ~/ and few in ~/Library/ which I was able to backup earlier (on Catalina) just fine.

I remember facing this on my last Mac as well, and giving Full Disk Access fixed it. Anyway, I tried running a backup (even with this "Some backup paths…" warning) and it failed:

[23/10/23 1:46 PM] Backup <job_name>_2023-10-23_13-46-41 queued.
[23/10/23 1:46 PM] Backup J<job_name>_2023-10-23_13-46-41 is running.
[23/10/23 1:47 PM] Backup <job_name>_2023-10-23_13-46-41 failed: tarsnap: : Cannot stat: No such file or directory

And this is the log from the log file where Tarsnap has written logs:

Task {<task uuid redacted>} started:
[/opt/homebrew/bin/tarsnap --no-default-config --keyfile /<user>/DirAbc/Config/git-ignored/tarsnap/tarsnap-key.key --cachedir /<user>/.cache/tarsnap --aggressive-networking -P -L --creationtime 1698049463 --quiet --print-stats --no-humanize-numbers -c -f Job_DirAbc_tarsnap_job_name_2023-10-23_13-54-23 --exclude .DS_Store --exclude .localized --exclude .fseventsd --exclude .Spotlight-V100 --exclude ._.Trashes --exclude .Trashes  /<user>/.config/tarsnap/tarsnap.db /<user>/.config/karabiner '/<user>/DirAbc/Some Folder' '/<user>/Library/Application Support/AddressBook/Sources' /<user>/.ssh '/<user>/Library/Mobile Documents/com~apple~CloudDocs/DirXyz' /<user>/.notes /<user>/DirAbc/Config /<user>/Library/Messages/chat.db /<user>/.dotfiles]
Task {<task uuid redacted>} finished with exit code 1:
[/opt/homebrew/bin/tarsnap --no-default-config --keyfile /<user>/DirAbc/Config/git-ignored/tarsnap/tarsnap-key.key --cachedir /<user>/.cache/tarsnap --aggressive-networking -P -L --creationtime 1698049463 --quiet --print-stats --no-humanize-numbers -c -f Job_DirAbc_tarsnap_job_name_2023-10-23_13-54-23 --exclude .DS_Store --exclude .localized --exclude .fseventsd --exclude .Spotlight-V100 --exclude ._.Trashes --exclude .Trashes  /<user>/.config/tarsnap/tarsnap.db /<user>/.config/karabiner '/<user>/DirAbc/Some Folder' '/<user>/Library/Application Support/AddressBook/Sources' /<user>/.ssh '/<user>/Library/Mobile Documents/com~apple~CloudDocs/DirXyz' /<user>/.notes /<user>/DirAbc/Config /<user>/Library/Messages/chat.db /<user>/.dotfiles]
tarsnap: : Cannot stat: No such file or directory
                                       Total size  Compressed size
All archives                         673869685257     647723158684
  (unique data)                        4724341500       2009564788
This archive                            709225866        685335132
New data                                   297652           182598
tarsnap: Error exit delayed from previous errors.

From the log, and the way I am used to reading it, it seems not to have been able to find "tarsnap" itself, which is weird because it is present at /opt/homebrew/bin/tarsnap (in Settings the client dir is /opt/homebrew/bin, and the GUI is able to detect client version 1.0.40 there) and points to tarsnap -> ../Cellar/tarsnap/1.0.40_1/bin/tarsnap.

I am stopping the debugging at this point. Please let me know what else I have to do.


PS. Additionally, could you please point me to a guide/tutorial on pruning old archives? My plan is to keep a certain number of annual, weekly, daily archives etc., but at the same time not lose any file. I.e. if there is a file that is in just one archive, is there a way for pruning to avoid deleting it? I am not sure I am clear on this. Also, if this issue is not the right place for this, I can send this query to the mailing list (though this is also publicly available).

gperciva commented 10 months ago

Thanks for the crash log! I'm surprised that it crashed there, so this is giving me some ideas for things to add to the test suite. (I've recorded this as #564.)

Unfortunately, there's no way to make the "updating archives from remote" faster with the current official release of the tarsnap CLI and GUI. (But I agree that 12 minutes is an unacceptable amount of time to wait to launch an app!)

With the current tarsnap CLI and GUI, I would guess that this step will take around 5 seconds + 0.6 seconds per archive, for your computer and internet connection. So if you reduced your archives to 100, I would expect it to take approximately 1 minute. (But I agree that this is still too long to be a good user experience!) (#565)
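As a rough sketch of that arithmetic (the function name estimate_seconds is made up here for illustration, and the 5-second and 0.6-second figures are only the guesses quoted above, not measurements):

```shell
# Estimated "Updating archives list" time in seconds:
# roughly 5 s fixed cost plus 0.6 s per archive (both figures are guesses).
estimate_seconds() {
    # $1 = number of archives; 6 * n / 10 keeps the arithmetic in integers.
    echo $(( 5 + 6 * $1 / 10 ))
}

estimate_seconds 100     # prints 65  (about 1 minute)
estimate_seconds 1130    # prints 683 (about 11.4 minutes)
```

The second figure is close to the 12 minutes reported for 1130 archives, which suggests the per-archive cost dominates.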

Regarding your backup, it seems that the GUI gets confused if some paths don't exist. For example, if your new computer doesn't have a directory called "~/Library/Mobile Documents/com~apple~CloudDocs/DirXyz", then all bets are off.

Unfortunately, the GUI does not handle this at all well. I can't see any way to modify an existing job to remove non-existent paths. And at the very least, the warning message should say which path doesn't exist! (I've recorded this problem as #563)

In the log file, the line:

tarsnap: : Cannot stat: No such file or directory

actually means that tarsnap is trying to open a file called " " (a single space) or a 0-length filename. The GUI found the tarsnap command-line utility just fine, so there's no confusion about /opt/homebrew or ../Cellar/tarsnap.

In this case, I think it would be faster to get it working if you deleted the previous Job, and made a new Job. Based on the log file extract, you were trying to back up:

/<user>/.config/tarsnap/tarsnap.db
/<user>/.config/karabiner
'/<user>/DirAbc/Some Folder'
'/<user>/Library/Application Support/AddressBook/Sources'
/<user>/.ssh
'/<user>/Library/Mobile Documents/com~apple~CloudDocs/DirXyz'
/<user>/.notes
/<user>/DirAbc/Config
/<user>/Library/Messages/chat.db
/<user>/.dotfiles

As for pruning old archives automatically, that's something that was mentioned in #34, but as you've seen in great detail, there's bigger problems in the GUI right now. Our website lists a few third-party command-line scripts which can help with this: https://www.tarsnap.com/helper-scripts.html But unfortunately I have no firsthand experience with them, so I can't recommend one vs. the others. If you wanted to ask for advice from people who use such scripts, the tarsnap-users mail list would be the perfect place.

amarendra commented 10 months ago

> So if you reduced your archives to 100, I would expect it to take approximately 1 minute. (But I agree that this is still too long to be a good user experience!) (#565)

Yes, indeed it is too long.

> Regarding your backup, it seems that the GUI gets confused if some paths don't exist. For example, if your new computer doesn't have a directory called "~/Library/Mobile Documents/com~apple~CloudDocs/DirXyz", then all bets are off.

Everything is present. Vorta backs up the same set (+ some more) and it backed up all that just fine twice yesterday and once again just now. In fact yesterday Vorta tried and backed up with a warning because it could not find ~/Library/Messages/Archive as it doesn't exist in Sonoma. But it still backed up the rest. (No this is not in one of the sources in tarsnap's list - everything exists for tarsnap)

Don't you think it's a critical bug if tarsnap just fails the entire backup when it cannot find one of the sources? And no, there's no source that doesn't exist. I added everything again yesterday via the GUI. I could have missed a quote or space or escape character if there were a config file (not sure there is one), but I added them all in the GUI.

> Unfortunately, the GUI does not handle this at all well. I can't see any way to modify an existing job to remove non-existent paths. And at the very least, the warning message should say which path doesn't exist! (I've recorded this problem as #563)

There are no non-existent paths; I cross-verified it again. So there is no way to know what exactly is failing? Is there a way to read the sources list in some config file somewhere, so that I can see where that "empty space" source is being picked from? Because, as you might know, tarsnap-gui sources are chosen in the UI where you select the paths, so if something doesn't exist it could not have been selected, right?

> actually means that tarsnap is trying to open a file called " " (a single space) or a 0-length filename. The GUI found the tarsnap command-line utility just fine, so there's no confusion about /opt/homebrew or ../Cellar/tarsnap.

As said above, there is no such path. Can you point me to where the config file is, or maybe to a db entry, so that I can look there?

> In this case, I think it would be faster to get it working if you deleted the previous Job, and made a new Job. Based on the log file extract, you were trying to back up:

I did not delete the previous job. I kept it because, if I am to use Tarsnap, I will be using it with the GUI, and I would like to find out why this is happening, because imho it is not a trivial issue. I am here to provide every log or piece of information you need.

Having said that - I did add a new job and added the same sources and it seems backup happened smoothly and real fast.

[25/10/23 11:31 AM] Job <new job name> added.
[25/10/23 11:31 AM] Backup Job_<new job name>_2023-10-25_11-31-48 queued.
[25/10/23 11:31 AM] Backup Job_<new job name>_2023-10-25_11-31-48 is running.
[25/10/23 11:32 AM] Backup Job_<new job name>_2023-10-25_11-31-48 completed. (4.37 MB new data on Tarsnap)
[25/10/23 11:32 AM] Fetching contents for archive Job_<new job name>_2023-10-25_11-31-48...
[25/10/23 11:32 AM] Fetching contents for archive Job_<new job name>_2023-10-25_11-31-48...
[25/10/23 11:32 AM] Fetching contents for archive Job_<new job name>_2023-10-25_11-31-48... done.
[25/10/23 11:32 AM] Fetching contents for archive Job_<new job name>_2023-10-25_11-31-48... done.

It has exactly the same sources selected as the erroneous job (I took screenshots to compare). Again, if there is a config lying somewhere on the disk it would be good to compare.

The detailed log written to disk wasn't really helpful. It had a few "-- 2044 output lines truncated by Tarsnap GUI --" lines, which is fine, but this was also the last line in the log. Basically there was no final summary of what happened. Besides, I don't see timestamps in the log written to disk; those might be helpful. I am not saying there is some technical problem with that, but all this honestly feels too barebone and some kind of assembled thing that somehow works. Now if that's the intention and the GUI is just a convenience and not a "supported" app, then I understand and respect your choice. Is that so?

> As for pruning ... mentioned in #34, but as you've seen in great detail,

I will check that.

> there's bigger problems in the GUI right now.

Yes that is kinda concerning seeing some things have been around for multiple years if I am reading it correctly.

> Our website lists a few third-party command-line scripts which can help with this: https://www.tarsnap.com/helper-scripts.html But unfortunately I have no firsthand experience with them, so I can't recommend one vs. the others.

I will steer clear of them then :)

> If you wanted to ask for advice from people who use such scripts, the tarsnap-users mail list would be the perfect place.

I will try that.

Thank you.

amarendra commented 10 months ago

> As for pruning old archives automatically, that's something that was mentioned in https://github.com/Tarsnap/tarsnap-gui/issues/34,

Checked it. There isn't really anything about it there, or in the issue that is linked from there. So I guess nothing? Maybe I will try tarsnap-users by creating a throwaway email then, as it doesn't really work with hidemyemail.

edit: Looks like https://www.google.com/search?q=tarsnap+pruning will have something I can start with and then mailing list if I am still clueless (and I will be clueless).

gperciva commented 10 months ago

> Don't you think it's a critical bug if tarsnap just fails the entire backup if it could not find one of the sources?

Depending on your point of view, the situation is either better or worse than you imagine.

The good news: the command-line tarsnap utility does exactly the right thing: it backs up as much as possible, prints a warning about any missing paths, and quits with exit code 1 to indicate that there was a problem. The idea is that the user can then investigate why there was a missing path.

The bad news: the GUI sees that something went wrong, and fails to add the archive to its list. (I've added this as #566)

To make it weirder: if you close the app and open it again, the GUI discovers the newly-created archive.
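
For anyone scripting the command-line client, the behaviour described above means a non-zero exit status should be read as "look at the warnings", not as "nothing was stored". A minimal sketch of that idea follows; the wrapper name run_and_report and its messages are made up for illustration and are not part of tarsnap or the GUI:

```shell
# Hypothetical wrapper: run any command, then report how it ended.
# A non-zero status is treated as "needs a look", not as "no archive exists".
run_and_report() {
    if "$@"; then
        echo "exit 0: completed without reported problems"
    else
        status=$?
        echo "exit $status: finished with problems; check the warnings and whether the archive exists"
        return "$status"
    fi
}

# Demo with a stand-in command; a real call would pass the tarsnap command line instead.
run_and_report true
```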

> all this honestly feels too barebone and some kind of assembled thing

I agree.

> Now if that's the intention and GUI is just a convenience and not a "supported" app that I understand and respect your choice. Is that so?

The intention is that the GUI will eventually be a useful and high-quality tool. It's clearly not there yet.

The tarsnap "getting started" page states "There is no graphical user interface". If we thought that the GUI was in good shape, we would mention it on tarsnap.com.

Now, I said "eventually". Can I give an estimate of when I think the GUI will be useful? Unfortunately, I cannot.

For example, as we've discussed, waiting for 12 minutes on app startup is completely unacceptable; even waiting 1 minute is too long. Improving this cannot be done in the GUI by itself; it will require changing the command-line and possibly the server.

> So there is no way to know what exactly is failing? Is there a way to read the sources list in some config file somewhere so that there I can see where that "empty space" source is being picked from?

The GUI definitely needs a way to display what is failing. https://github.com/Tarsnap/tarsnap-gui/commit/d0c40ebc7bff1e9adbcd5c1e0cdcc1922191ec2e is a quick hack that does this, but that method doesn't allow people to un-select any missing paths, so it's not a complete fix.

If you are very curious and can browse SQLite files, you could take a look at tarsnap.db in the App data directory. On FreeBSD, I use sqlitebrowser; it looks like you can get that on macOS with brew install --cask db-browser-for-sqlite.

Once you've opened the SQL database, click on "Browse Data", then change the table to "jobs". If you click on the "urls" cell of the appropriate row, you should see the paths that the job is trying to archive.

But at this point, I don't think it would really help. I mean, since the GUI currently does not have a way to un-select any garbage in the list of files, then knowing that it has a bad filename would satisfy curiosity, but that wouldn't help with making archives.

> > Our website lists a few third-party command-line scripts... But unfortunately I have no firsthand experience with them, so I can't recommend one vs. the others.
>
> I will steer clear of them then :)

Oh, I don't think you need to avoid them. It's more that because I'm an employee of Tarsnap, I cannot endorse (or un-endorse) any one of them.

Hmm... it's fair for me to say this: Michael W. Lucas published "Tarsnap Mastery" in 2015, and his book mentions Feather, Tarsnapper, and ACTS. On page 103, he says that he recommends ACTS for routine use.

> I will check that.

(about pruning archives, and #34)

Ah, sorry for not being clear. What I meant to say is "it's a requested feature, and it's on the list, but not implemented yet."

amarendra commented 10 months ago

I had actually used sqlite browser to look at tarsnap.db, and I had totally overlooked that the urls column could be the sources list. My bad.

And yes, the old job's urls cell looked empty at a quick look, which is why I think I didn't click to check the cell contents further. When I looked at the table again today, I did see file:///... something for the urls in the new job's row.

The old job's urls has an empty line for its first line somehow. So that is there. In case you need/want to debug this - if you need more info - do let me know otherwise I will just delete the old job.
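
A minimal sketch for spotting such blank entries without opening a GUI browser. The helper name blank_entries and the sample paths are made up for illustration; the "jobs" table and "urls" column are the ones mentioned earlier in this thread, and the sqlite3 command-line tool is assumed to be installed:

```shell
# Print the 1-based line numbers of blank (empty or whitespace-only) entries
# in a newline-separated path list read from stdin.
blank_entries() {
    grep -n '^[[:space:]]*$' | cut -d: -f1
}

# Illustrative sample only: a list whose first line is empty.
printf '\nfile:///Users/example/.ssh\nfile:///Users/example/.notes\n' | blank_entries
# prints 1

# Against the real database (assumption: run from the App data directory):
#   sqlite3 tarsnap.db 'SELECT urls FROM jobs;' | blank_entries
```

With more than one job in the table, the reported line numbers count across all rows, so it is only a quick indicator that a blank entry exists somewhere.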

So just to make it clear for myself - there is no pruning as of now? Not even a convoluted (for me) script way? Because 10 mins and 1000+ archives seem really too much as you mentioned.

I am going to look at these and maybe give them a try. As long as I don't have to maintain more than one script for tarsnap, with few moving parts such as schedulers, and can keep seeing regular reports (this succeeded, this failed, this happened with this warning) without having to oil the machinery every week, I am good. (Lost all interest in tinkering with CLI etc., especially for things like personal data backup, after college days and the days of various distros were over. Guess I am not paranoid enough ;))

> There is no graphical user interface

That's a bummer really :) But thank you for debugging with me. I will hit the mailing list if I get stuck with those "helper scripts" or if I need to find a way for pruning the way I want to do.

(Though I really wish there was a way to communicate on the mailing list without exposing emails in plain text for harvesting - or a more classical mailman kind of way where I didn't have to send an email, so I could easily use something like HideMyEmail, or even GitHub discussions).

Thanks again for your patience. Really appreciate it.

gperciva commented 10 months ago

Thanks for checking the url list! I've added it as #567. (As always, no guarantee about when it'll be fixed; but at least it's recorded.)

As for pruning: the GUI can delete individual archives, but there's no automatic pruning yet, sorry.

If you want to do it on the command line, there are two options:

1) If there's only a few archives you want to delete, you can specify multiple ones at once on the command-line: https://www.tarsnap.com/improve-speed.html#faster-delete

2) If there's many archives you want to delete, you can put them in a text file. For example:

$ more deleteme.txt 
foo
bar
$ tarsnap -d --archive-names deleteme.txt 

That will delete the archives foo and bar. Of course, you'll want to double-check the archive names that you'll be deleting!
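
As a hedged sketch of how such a file could be built: the GUI log earlier in this thread shows archive names ending in _YYYY-MM-DD_HH-MM-SS, so one option is to filter names by that embedded date. The helper name older_than, the cutoff date, and the sample archive names below are made up for illustration; this is not a tarsnap feature:

```shell
# Read archive names on stdin; print those whose trailing
# _YYYY-MM-DD_HH-MM-SS suffix has a date earlier than the cutoff $1 (YYYY-MM-DD).
# Names without such a suffix are never printed.
older_than() {
    cutoff=$(printf '%s' "$1" | tr -d '-')
    while IFS= read -r name; do
        stamp=$(printf '%s\n' "$name" |
            sed -n 's/.*_\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)_[0-9]\{2\}-[0-9]\{2\}-[0-9]\{2\}$/\1/p')
        [ -n "$stamp" ] || continue
        # Compare the dates as plain numbers, e.g. 20220501 < 20230101.
        if [ "$(printf '%s' "$stamp" | tr -d '-')" -lt "$cutoff" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Illustrative sample only; real input would come from: tarsnap --list-archives
printf 'Job_docs_2022-05-01_10-00-00\nJob_docs_2023-10-25_11-31-48\nmanual-archive\n' |
    older_than 2023-01-01
# prints Job_docs_2022-05-01_10-00-00
```

Redirecting the output to deleteme.txt and reading the file through before running the delete command keeps the double-check step in place. This filter only selects by age; it does not check whether a file exists only in an archive that is about to be deleted.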

Granted, I hear you about not wanting to mess around with the CLI for backups. Unfortunately, at the moment it's the most reliable way to use tarsnap. I'm just trying to give you as much information as possible, so you can decide how to proceed.

As for the mailing list, unfortunately I'm not knowledgeable about the software and options out there for running a list. But if you'd prefer to avoid your email address going there, you could send the question to me privately, and I'll send the question to the list for you.

(Anybody gathering email addresses from the list will already have mine!)