gauteh / lieer

Fast email-fetching and sending and two-way tag synchronization between notmuch and GMail
http://lieer.gaute.vetsj.com
Other
497 stars 60 forks source link

Updating tags on non-existent file #75

Closed learmj closed 6 years ago

learmj commented 6 years ago

After performing a sync of my pretty large mailbox, when updating tags, I consistently see an exception about failing to update tags on a non-existent file. I think gmailieer may have somehow got the file name wrong, but I'm not sure... Here's the trace: ... resolving changes: 100% receiving content: 100% updating tags (0Δ): 0%| | 0/13602 [00:00<?, ?it/s]Traceback (most recent call last): File "/usr/local/bin/gmi", line 8, in g.main () File "/b/git/gmailieer/lieer/gmailieer.py", line 145, in main args.func (args) File "/b/git/gmailieer/lieer/gmailieer.py", line 215, in sync self.pull (args, True) File "/b/git/gmailieer/lieer/gmailieer.py", line 341, in pull self.partial_pull () File "/b/git/gmailieer/lieer/gmailieer.py", line 492, in partial_pull r = self.local.update_tags (m, None, db) File "/b/git/gmailieer/lieer/local.py", line 401, in update_tags raise Local.RepositoryException ("tried to update tags on non-existant file: %s" % fname) lieer.local.RepositoryException: tried to update tags on non-existant file: /path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,

The file "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2," does not exist in the mail store, but file "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,S" does.

Could it be possible that gmailieer has had a problem parsing this file name?

notmuch search for "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,S" correctly identifies the email and shows the tags applied to it. The message only has one mime part which is text/html (no text/plain part).

Also, is there a particular reason why this exception hangs gmailieer? I think it would be better if it exited. -- Matt

gauteh commented 6 years ago

Are you by any chance reading your email while gmi is syncing? It seems that the 'S' (seen) flag is added to a message. This will happen if the message has been read by a MUA and the maildir flags have been synced. No changes to the messages should be made while gmi is syncing.

I don't see how it can hang.. it should either exit or just continuing to sync the next messages.

Matthew Lear writes on april 24, 2018 11:40:

After performing a sync of my pretty large mailbox, when updating tags, I consistently see an exception about failing to update tags on a non-existent file. I think gmailieer may have somehow got the file name wrong, but I'm not sure... Here's the trace: ... resolving changes: 100% receiving content: 100% updating tags (0Δ): 0%| | 0/13602 [00:00<?, ?it/s]Traceback (most recent call last): File "/usr/local/bin/gmi", line 8, in g.main () File "/b/git/gmailieer/lieer/gmailieer.py", line 145, in main args.func (args) File "/b/git/gmailieer/lieer/gmailieer.py", line 215, in sync self.pull (args, True) File "/b/git/gmailieer/lieer/gmailieer.py", line 341, in pull self.partial_pull () File "/b/git/gmailieer/lieer/gmailieer.py", line 492, in partial_pull r = self.local.update_tags (m, None, db) File "/b/git/gmailieer/lieer/local.py", line 401, in update_tags raise Local.RepositoryException ("tried to update tags on non-existant file: %s" % fname) lieer.local.RepositoryException: tried to update tags on non-existant file: /path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,

The file "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2," does not exist in the mail store, but file "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,S" does.

Could it be possible that gmailieer has had a problem parsing this file name?

notmuch search for "/path/to/toplevelmaildir/mail/cur/162f6b5441a3ab90:2,S" correctly identifies the email and shows the tags applied to it. The message only has one mime part which is text/html (no text/plain part).

Also, is there a particular reason why this exception hangs gmailieer? I think it would be better if it exited. -- Matt

-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/gauteh/gmailieer/issues/75

learmj commented 6 years ago

I agree that would be a possibility, but this email was not one that I searched for or read so no - I never read it (I use the gmail web UI and notmuch-emacs as mail readers). There is most definitely a hang when I see the error message about applying a tag to a non existent file, though. I periodically invoke gmi sync from a bash script with time stamp prints on either side. A sync started at 1938 last night, did stuff and then locked up with this message (this time on another message). I re-ran the sync again and it seems fine. I have also seen the message about applying a tag to a non existent file when doing a gmi push and that definitely exits, but when I see it doing a gmi sync, it seems to hang.

gauteh commented 6 years ago

Ok, can you try to reproduce the bug with --debug on. Perhaps the file got changed earlier by gmi on and the cache wasn't updated.

Matthew Lear writes on april 25, 2018 9:57:

I agree that would be a possibility, but this email was not one that I searched for or read so no - I never read it (I use the gmail web UI and notmuch-emacs as mail readers). There is most definitely a hang when I see the error message about applying a tag to a non existent file, though. I periodically invoke gmi sync from a bash script with time stamp prints on either side. A sync started at 1938 last night, did stuff and then locked up with this message (this time on another message). I re-ran the sync again and it seems fine. I have also seen the message about applying a tag to a non existent file when doing a gmi push and that definitely exits, but when I see it doing a gmi sync, it seems to hang.

-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/gauteh/gmailieer/issues/75#issuecomment-384194907

learmj commented 6 years ago

Ok. Will do. I just hit another exception which also seems to have caused a hang. Not sure if it's related. Traceback (most recent call last): File "/usr/local/bin/gmi", line 8, in g.main () File "/b/git/gmailieer/lieer/gmailieer.py", line 145, in main args.func (args) File "/b/git/gmailieer/lieer/gmailieer.py", line 215, in sync self.pull (args, True) File "/b/git/gmailieer/lieer/gmailieer.py", line 341, in pull self.partial_pull () File "/b/git/gmailieer/lieer/gmailieer.py", line 471, in partial_pull updated = self.get_content (message_gids) File "/b/git/gmailieer/lieer/gmailieer.py", line 622, in get_content self.remote.get_messages (need_content, _got_msgs, 'raw') File "/b/git/gmailieer/lieer/remote.py", line 110, in func_wrap return func (self, *args, **kwargs) File "/b/git/gmailieer/lieer/remote.py", line 330, in get_messages cb (msg_batch) File "/b/git/gmailieer/lieer/gmailieer.py", line 620, in _got_msgs self.local.store (m, db) File "/b/git/gmailieer/lieer/local.py", line 344, in store raise Local.RepositoryException ("local file already exists: %s" % p) lieer.local.RepositoryException: local file already exists: /path/to/toplevelmaildir/mail/cur/162d00e6088ff9b6:2,S

gauteh commented 6 years ago

Matthew Lear writes on april 25, 2018 9:57:

I agree that would be a possibility, but this email was not one that I searched for or read so no - I never read it (I use the gmail web UI and notmuch-emacs as mail readers).

Using the gmail web UI or the notmuch-emacs would cause the message to be read and the flags to change. There are other cases where this bug could happen, but if there's any chance you've read then that is very useful information.

The second trace you just posted might be a related bug. Probably a flag mismatch.. Is there any chance you have to gmi instances running simultanuously? That, or other changes to the db (using emacs client or CLI tool), could cause a hang.

learmj commented 6 years ago

I'm consistently hitting the 'tried to update tags on non-existant file' problem today. I'm reading mail with notmuch-emacs while a bash script runs 'gmi --logging_level DEBUG sync' every two minutes. I'm not seeing any additional debug printed from gmi, and the sync still hangs. I'm ok with the exception and I understand why it happens, but it really shouldn't hang... Is there a different (correct?) way to get more debug output?

gauteh commented 6 years ago

Matthew Lear writes on april 26, 2018 15:31:

I'm consistently hitting the 'tried to update tags on non-existant file' problem today. I'm reading mail with notmuch-emacs while a bash script runs 'gmi --logging_level DEBUG sync' every two minutes. I'm not seeing any additional debug printed from gmi, and the sync still hangs. I'm ok with the exception and I understand why it happens, but it really shouldn't hang... Is there a different (correct?) way to get more debug output?

Err.. right so there isn't a way to turn on more debug output.. was mixing up with some other project.

Ok, so it is probably caused by changing the read state through the emacs interface?

It definetely shouldn't hang, if you ctrl+C, what is the backtrace you get? That should tell us where it is hanging.

It would be possible to re-scan the maildir if there are inconsitensies with the file names (it is a bit expensive, but shouldn't happen too often). If it fails a second time it means that the file has been deleted, which is not supported. A second pull should fix that however.

gauteh commented 6 years ago

Also; you could try to turn off the progressbar. Maybe its just the progressbar that messes up the TTY and it therefore seems like it hangs?

gauteh commented 6 years ago

I don't think the second backtrace is related. It might be caused by multiple gmi processes.

gauteh commented 6 years ago

Does #76 fix your issue?

gauteh commented 6 years ago

77 should fix multiple gmi processes issue.

learmj commented 6 years ago

Ok. Nothing from ctrl+c. Good call on trying with no progress bar. I'll give it a go... I definitely didn't/don't have multiple gmi processes running. It's always run in one shell via a script and I even have the following in a script function which then invokes gmi (which I think should prevent multiple processes running [GMI = /usr/local/bin/gmi]):

pid=$(pgrep -f $GMI) if [[ ${pid} -gt 0 ]] ; then echo "gmi is running with pid ${pid}" return fi

If I continue to see it I'll merge #76 and re-test. Thanks.

learmj commented 6 years ago

Ok, so when using 'gmi sync -s' there is no hang when the "tried to update tags on non-existant file" error occurs.

gauteh commented 6 years ago

Ok. The progressbar cleanup could probably be better, but don't think I'm going to prioritize that now. I'll merge #76 and this bug should go away anyway.

learmj commented 6 years ago

Ok. I've moved to 4ad1392 and re-enabled the progress bar so will see how it goes. Bit concerned about "If you make changes to the local repository (tags) while syncing you might loose changes." though... I guess this is a result of the MUA (such as notmuch-emacs) and Gmailieer accessing the notmuch database? If the database is locked, can't Gmailieer detect that and re-try [next time]?

learmj commented 6 years ago

Still broken...and still hangs...

updating tags (0Δ): 0%| | 0/1 [00:00<?, ?it/s]missing file: reloading cache to check for changes..done. Traceback (most recent call last): File "/usr/local/bin/gmi", line 8, in g.main () File "/b/git/gmailieer/lieer/gmailieer.py", line 145, in main args.func (args) File "/b/git/gmailieer/lieer/gmailieer.py", line 215, in sync self.pull (args, True) File "/b/git/gmailieer/lieer/gmailieer.py", line 341, in pull self.partial_pull () File "/b/git/gmailieer/lieer/gmailieer.py", line 492, in partial_pull r = self.local.update_tags (m, None, db) File "/b/git/gmailieer/lieer/local.py", line 408, in update_tags raise Local.RepositoryException ("tried to update tags on non-existant file: %s" % fname) lieer.local.RepositoryException: tried to update tags on non-existant file: /path/to/toplevelmaildir/mail/cur/16306c8365d5230f:2,

gauteh commented 6 years ago

Matthew Lear writes on april 27, 2018 13:26:

Still broken...and still hangs...

My bad, didn't reload filename - please try #78.

gauteh commented 6 years ago

Matthew Lear writes on april 27, 2018 12:21:

Ok. I've moved to 4ad1392 and re-enabled the progress bar so will see how it goes. Bit concerned about "If you make changes to the local repository (tags) while syncing you might loose changes." though... I guess this is a result of the MUA (such as notmuch-emacs) and Gmailieer accessing the notmuch database? If the database is locked, can't Gmailieer detect that and re-try?

Actually, the point as I put it there is just confusing and inaccurate. The paragraph about syncing [0] already covers this. Unless you use the '-f' flag any conflicting remote changes will take presedence over local changes. Any changes on the remote or the local while gmi is running should not have any effect (except if remote changes are made to a message which is currently being synced to the remote, the remote changes may be overwritten). Any other changes will be picked up in the next sync. I will remove the point from caveats.

[0] https://github.com/gauteh/gmailieer#normal-synchronization-routine

gauteh commented 6 years ago

Gaute Hope writes on april 27, 2018 15:09:

I will remove the point from caveats.

Check #78 for updates to the README as well.

learmj commented 6 years ago

Cool. I've switched to eb3b8b1 on origin/reload-cache-2 and re-enabled the progress bar. I'll let you know what I see...

learmj commented 6 years ago

I'm sure that since moving to this, messages in my 'inbox view' in notmuch emacs (messages which have the inbox tag) does not match my Inbox in Gmail. Up until very recently, once a gmi sync had completed, the result of notmuch search tag:inbox was identical to what I saw after clicking on the Inbox label in Gmail. It's not the case any more... Looking at the code diffs between #78 and what I was at prior to changing (4ad1392) I do not see why this would cause such behaviour.

gauteh commented 6 years ago

Matthew Lear writes on april 27, 2018 18:25:

I'm sure that since moving to this, messages in my 'inbox view' in notmuch emacs (messages which have the inbox tag) does not match my Inbox in Gmail. Up until very recently, once a gmi sync had completed, the result of notmuch search tag:inbox was identical to what I saw after clicking on the Inbox label in Gmail. It's not the case any more... Looking at the code diffs between #78 and what I was at prior to changing (4ad1392) I do not see why this would cause such behaviour.

Are you sure that you have run gmi sync enough times after you've made changes on either side?

gauteh commented 6 years ago

Gaute Hope writes on april 27, 2018 18:59:

Are you sure that you have run gmi sync enough times after you've made changes on either side?

Also, try to fetch changes anew, I've merged the single-process branch to master so that we can exclude that.

gauteh commented 6 years ago

Without locks on the notmuch database by gmaileer and any other notmuch tool modifying the db it won't be possible to safeguard against concurrent changes being lost out from the sync, without doing a fair bit of redundant checking. The revision of the database is acquired after we have made the changes to the local db in a pull. If during this time, other changes have been made, the revision will also be above these messages.

The only method I can think of is to use the revision from before the sync, but this will cause all messages to be rechecked again at the next sync. It shouldn't be too expensive on incremental syncs, but still.

gauteh commented 6 years ago

I have avoided locks because gmailieer could run for a long time and this might be disruptive, might have to change that.. :/

gauteh commented 6 years ago

Actually, we are already using the revision from before the sync (the push). Partial pull does not update the lastmod.

gauteh commented 6 years ago

@stdbull did you have a chance to test with the new locked gmi instance as well as running gmi sync multiple times to ensure that you have matching states? Note that you can do gmi pull -f to pull down all tags from remote again if you think the states are inconsistent (this will take a long time, but no content will be re-fetched, so not nearly as long as initial pull).

learmj commented 6 years ago

Hi. Sorry for the delay... I checked out 36b4fc9 yesterday morning and have been using things in my 'normal flow' since then. I've not seen any occurrences of the 'updating tags on non-existent file' error. I do still seem to have a discrepancy between Gmail's Inbox view and my notmuch search tag:inbox view. I might run a pull -f overnight to see if it re-syncs things but I'm not too concerned about that right now.

gauteh commented 6 years ago

Ok, thanks. Please let me know how it works out.

gauteh commented 6 years ago

Closing, since the original issue is fixed either way. Please open a new issue if you can confirm a sync issue.