ProtonMail / proton-bridge

Proton Mail Bridge application
GNU General Public License v3.0
1.14k stars 152 forks source link

Searching older email under Thunderbird mangles messages. #423

Open petrmanek opened 11 months ago

petrmanek commented 11 months ago

The main symptom is that when searching for older email, I run across perfectly good messages interleaved with corrupted ones. A corrupted message looks like any other message (e.g. it has a perfectly sane header), however when I open it, its body is completely unreadable. More often than not, the corrupted body resembles subset of a body of some other, seemingly arbitrarily selected message -- this usually appears as some HTML excerpt without beginning and end. More often than not, this happens to older messages which have attachments.

Expected Behavior

When opening an older message, see its body as it appears in ProtonMail web interface.

Current Behavior

Instead I see a seemingly randomly chosen subset of some other message. This is what it looks like:

20230919_10h36m06s_grim

Possible Solution

Uncertain. I would try finding out something that all corrupted messages have in common. That would probably provide a hint about how this corruption happens and why.

Steps to Reproduce

  1. Set up ProtonMail Bridge and Thunderbird with a mailbox containing at least several months' worth of email.
  2. Open Thunderbird and search for some messages (e.g. using Ctrl+K hotkey).
  3. Find some older messages with attachments in the listing.
  4. Double-click messages to see their contents.

Version Information

Context (Environment)

Here are only relevant variables:

LANG=cs_CZ.UTF-8
LC_MESSAGES=C
LC_TIME=cs_CZ.UTF-8

Detailed Description

This bug is erratic, and I do not know how easy it will be to reproduce for anybody else. However, I have been running ProtonMail-Bridge with Thunderbird for several years now, and it happens in 80% of all my email searches, which is why I am severely tempted not to search for anything in my email history unless I am logged in via the web interface. I am uncertain whether the error is on the side of the Bridge or Thunderbird, but I figure that the developers should at least know about this behavior. Perhaps there are other users who experience this as well.

As far as I understand the messages are not corrupted online, just inside Thunderbird.

UPDATE: I just looked in the body of a corrupted message that I exported as .eml file. And it appears that there are actually several messages in thee, concatenated including headers and bodies. Interestingly, all messages appear to have been received around the same date. For instance, grep ^Date: corrupted_message.eml yields:

Date: Fri, 08 Sep 2023 21:44:04 +0000
Date: Wed, 13 Sep 2023 16:01:52 +0200
Date: Tue, 12 Sep 2023 22:04:22 +0200
Date: Wed, 13 Sep 2023 10:20:17 +0000
Date: Wed, 13 Sep 2023 09:58:28 +0000
Date: Tue, 12 Sep 2023 13:49:01 -0700
Date: Wed, 13 Sep 2023 14:13:37 +0200
Date: Wed, 13 Sep 2023 15:00:37 +0200
Date: Wed, 13 Sep 2023 03:49:44 -0700
Date: Wed, 13 Sep 2023 03:23:57 -0700
Date: Wed, 13 Sep 2023 02:18:34 -0700
Date: Tue, 12 Sep 2023 22:37:32 -0700
Date: Tue, 12 Sep 2023 11:00:00 +0200
Date: Tue, 12 Sep 2023 11:33:34 +0200
Date: Tue, 12 Sep 2023 15:35:14 +0200
Date: Tue, 12 Sep 2023 03:40:24 -0700
Date: Tue, 12 Sep 2023 14:40:38 +0200
Date: Tue, 12 Sep 2023 06:00:01 +0200
Date: Tue, 12 Sep 2023 07:30:37 +0200
Date: Mon, 11 Sep 2023 21:50:18 +0100
Date: Mon, 11 Sep 2023 16:00:42 -0700

Possible Implementation

No idea.

LBeernaertProton commented 11 months ago

Hey @petrmanek thanks for the report.

Did you continuously use the same bridge set up in a sequence of updates? E.g: start with v3.0 and update all the way through v3.4.x?

Have you tried removing and adding your account and see if the issue is still there? Before you do you can backup the ~/config/protonmail and ~/.local/share/protonmail folders if you wish to restore this state later.

petrmanek commented 11 months ago

Did you continuously use the same bridge set up in a sequence of updates? E.g: start with v3.0 and update all the way through v3.4.x?

Yes, I am running a bleeding-edge system, so whenever a new Bridge is released I usually update to it within a day or two.

Have you tried removing and adding your account and see if the issue is still there? Before you do you can backup the ~/config/protonmail and ~/.local/share/protonmail folders if you wish to restore this state later.

I have done that once or twice in the past, but I can try that again at some point this week. The only drawback of that is time -- my inbox is huuuge and takes several hours to sync. I will let you know how it went.

petrmanek commented 11 months ago

Side note:

The reason I filed this bug report today is that I was looking earlier for a specific message with an attachment in Thunderbird. I correctly found the header of said message, but when I opened it I was presented with all-too-familiar mangled body. Having encountered the bug before, I logged into ProtonMail web interface, found the same message, and retrieved its attachment through my browser.

Here comes the interesting bit. While I was writing this bug report, I left the ProtonMail web interface running in a background tab with the message opened. When I came back to it, the previously opened message was nowhere to be found. It was as if it did not exist in my inbox in the first place. I tried repeating my search several times, and finally found the message again -- but this time it only had its header and body, and was completely stripped of all its attachments. It used to have 2 attached PDF files, now it does not have any. Those 2 files appear to be lost now -- fortunately I backed them up some 30 minutes ago, but this is suspiciously weird behavior that could potentially cost someone their data.

I am a software engineer but I have no idea how to make sense of this. Perhaps the Bridge on my PC somehow tried to resolve a sync conflict by deleting the upstream message, but why would it then copy it back without attachments?

If desirable, I can try fetch some logs from Bridge while they are there. I am afraid that not running any particular debugging/verbose mode, so we may be limited to just conventional log messages. Please let me know if you would find that useful.

LBeernaertProton commented 11 months ago

We only delete messages if:

We are aware that thunderbird has a some delays with the expunge instructions. In the latest thunderbird, we often find ourselves using the compact operation to flush pending expunges. Maybe that's what you saw?

I would recommend, when you have time, to repeat this action on cleanly synced account. It's possible that some old bug managed to stick around during upgrades.

Unfortunately, without knowing the proton message id for this message, looking at the logs would be like finding a needle in haystack :|.

petrmanek commented 11 months ago

We are aware that thunderbird has a some delays with the expunge instructions. In the latest thunderbird, we often find ourselves using the compact operation to flush pending expunges. Maybe that's what you saw?

It is possible. The saga continues: Now, about 20 minutes after my last comment, the message appears in the web interface again with both PDF attachments intact. I think I am going to start making screenshots of this to prove that I am not crazy. :sweat_smile:

I would recommend, when you have time, to repeat this action on cleanly synced account. It's possible that some old bug managed to stick around during upgrades.

Will do. I am currently at work, so I cannot jeopardize my production environment. However, perhaps later in the week/weekend I can try it out.

Unfortunately, without knowing the proton message id for this message, looking at the logs would be like finding a needle in haystack :|.

Is that identical to the Message-Id header? Can I retrieve this somehow using the web interface?

LBeernaertProton commented 11 months ago

Is that identical to the Message-Id header? Can I retrieve this somehow using the web interface?

Yes. It should be visible in the url bar https://mail.proton.me/.../<id>. But it is best to disable conversation mode to get the correct id.

petrmanek commented 11 months ago

Yes. It should be visible in the url bar https://mail.proton.me/.../<id>. But it is best to disable conversation mode to get the correct id.

I think I have it. It looks like a base64-encoded string. What can I do next? Should I grep in ~/.local/state/protonmail/bridge/*/*.log for that ID?

LBeernaertProton commented 11 months ago

If anything is in the logs, you can try a grep by the first 10-12 characters of the ID.

petrmanek commented 11 months ago

If anything is in the logs, you can try a grep by the first 10-12 characters of the ID.

Good, I got something. Don't know if it is useful, though. Note that I had to look in ~/.local/share/protonmail/bridge-v3/logs rather than ~/.local/state/protonmail/bridge.

filtered.log

LBeernaertProton commented 11 months ago

Nothing out of the ordinary I'm afraid. The message was created and received some updates (Flags, label/mbox operations).

LBeernaertProton commented 11 months ago

So far we have not received any reports about corrupted email messages. The only time we experience something similar was due to a problem in an IMAP client that failed to rebuild it's internal cache after UIDValidity change.

petrmanek commented 9 months ago

Since upgrading to v3.6.1 I have not seen symptoms of this issue for a while, so I am closing it for now. If I ever encounter it again, I will reopen it and post more concrete logs.

petrmanek commented 8 months ago

@LBeernaertProton Not sure how to approach this, but I have just observed the issue again. Searching for a USPS email with v3.6.1 and Thunderbird v115.5.0 brought up the following result:

20231203_19h27m56s_grim

As you can clearly see, this email is a weird mix of two messages, one from the US Postal Service (USPS), and other from Fermilab (fnal.gov). While the body appears to belong to the former, the Sender, From, To and Subject seem to belong to the latter. Furthermore, when opened the message body is complete gibberish, possibly part of some HTML message:

20231203_19h33m31s_grim

Further inspection reveals that the body actually appears to belong to a third email, which was an automated message originated from my self-hosted GitLab instance. Interestingly, revealing source code in Thunderbird yields the same view. Where normally the message would begin with email headers, I see the same HTML from the middle of a table (as in the previous screenshot). That being said, scrolling further down reveals headers of an email from USPS.

20231203_19h38m29s_grim

These appear valid and relevant, and are followed by message body, which is then again cut off at a seemingly arbitrary location in the middle of the HTML. This, however, appears only in Thunderbird. All three messages can be found uncorrupted in the webmail app.

I would be happy to provide further details (versions, logs), or run tests upon request. Please advise.

LBeernaertProton commented 8 months ago

@petrmanek Could you repeat the following steps:

Note that this log will contain all the data send between the IMAP client and bridge. If you feel comfortable forwarding this, please send it to my email address.

Alternatively, I just need to inspect the the communication for the SEARCH and FETCH commands and their responses.

petrmanek commented 8 months ago

@LBeernaertProton I have sent you the log file via email.

A couple of observations I made:

This leads me to believe that Thunderbird's search feature is somehow only inquiring its local database, not the Bridge. Would that be your conclusion as well? Can I somehow force Thunderbird to search the server rather than its local cache?

LBeernaertProton commented 8 months ago

In this case, it seems that thunderbird is performing the search.

What I would like you to check is if the fetch response have the garbled emails you see.

EDIT: you also need to set GLUON_LOG_IMAP_LINE_LIMIT=5000 to see the full imap responses.

petrmanek commented 8 months ago

Right, here is what I tried:

I am running out of ideas. Short of configuring Thunderbird to re-download my entire 15 GB inbox, I do not see a viable way to force this behavior to happen again. If only there was a way to force Thunderbird to clear specific email out local cache! Another option would be to run the Bridge continuously in the tracing mode for the foreseeable future and harvest the logs after the next incident.

What do you think?

LBeernaertProton commented 8 months ago

The thing I'm looking for is if the message bodies returned by the FETCH command after the search are not malformed.

Either one of these might be happening:

All my existing tests of my mailbox are working. The only way to determine what is happening on your machine is: