Closed jknockaert closed 1 year ago
Hey @jknockaert . Thanks for the report we will investigate on our end.
OK here's my best guess of what is happening. Upon import the Easy Switch algorithm validates the Date
header and if it decides that it is not properly formatted (as per RFC3501?) it stores the original value to a new 'X-Original-Date' header and reformats the Date
header. So far so good (but it would be nice to have some more documentation on the Easy Switch algorithm).
Then when the Bridge returns the messages it seems to try to evaluate the X-Original-Date
header, and failing to do so it returns a default value, whereas it should just return the header without any further processing.
The issue has been reproduced and has been internally tracked as GODT-2480.
I'm wondering if this could cause a bug in Geary where I say to only sync the last week, but it ended up syncing absolutely everything
hey @primalmotion , sometimes the proton backend can send a refresh event to your account and in that case we need to resync your entire state. It most likely what happened.
@LBeernaertProton I can understand the bridge resyncs the entire state, but not sure why this makes the Geary setting to not sync mails older than a week to be synced. I'm on a phone with not so much disk space. proton-bridge cache is about 2GB, and normally Geary cache is about a few MB. But this made my geary cache to use roughly 2GB too
@primalmotion Unfortunately I also don't know what might be causing that issue with that specific email client.
We have patched this bug and we will release it as soon as we are able. I would recommend trying again once that release is out and seeing if the issue is resolved. I will post to the thread once it is ready.
(Not sure why the parsed
Date
header is 10s off, but that's an issue with Easy Switch which I am not discussing here.)
Just in case anyone wondered: for some reason Easy Switch is using the INTERNALDATE
timestamp here, likely because it failed to parse the original Date
header for some unknown reason. Not sure if and where issues with Easy Switch should be reported.
I would recommend trying again once that release is out and seeing if the issue is resolved
Will do :) thanks!
The fix for this issue has been pushed into the v3 branch.
DISCLAIMER: This is a mirror of our internal development branch and has not yet been fully tested and vetted.
An official release will be made as soon as we are ready.
@LBeernaertProton not to hijack this issue but, I have something incredibly similar - it looks like the ones I imported with the bridge a few years ago aren't getting the correct date (as in, they're getting the date they went into proton bridge (latest version));
I couldn't find it in the code but I assume the Date
field is being taken, for my imported emails that would be;
Date: Sat, 17 Nov 2018 23:33:24 UT
The issue here is two fold, either the original provider or the bridge used UT instead of UTC and it doesn't look like time.Parse()
handles that with RFC1123
layout (it also doesn't match the RFC1123Z
layout either. I'm going based off the input into the tests; https://github.com/ProtonMail/proton-bridge/blob/7be46a47402be596a8a1f8925c09ef26f9445fc3/pkg/message/build_framework_test.go#LL60C46-L60C59 )
As a result, time.Parse()
produces; 0001-01-01 00:00:00 +0000 UTC ( https://go.dev/play/p/Sj7aR-Qd9wa )
Would the upcoming change also fix this? or would this format still be problematic? Happy to provide further examples
Hey @andrew-s, it will really depend on how long ago those messages were imported.
I did a quick check and v3 has support for the UT
timezone notation. I think some of v2 may support that as well.
The only way to address this would be to re-import that specific message on a more recent version of Bridge
Thanks for the quick response @LBeernaertProton - these are emails from 2019 and for years previous - it's impossible to reimport them as those accounts are long gone (that they were imported from) and with an inbox of >350k messages any operation takes time.
Is it possible to handle this during sync in the bridge? I feel this would be a better solution for those earlier (v1 or so bridge imports) - happy to provide any implementation work if pointed in the right direction
@andrew-s I can't say for certain it depends on the what data was sent to the servers at that point. Can you provide all the date related headers (see list below) for one of these messages ?
I went through this one specifically and the headers are;
Date: Sat, 17 Nov 2018 23:33:24 UT
There are no X-Original-Date
or X-Pm-Date
headers - I assume the X-Pm-*
headers are added during import?
I'm a bit confused at the moment.
If these message were imported I would expect to see at least an X-Pm-Date
header. If there was a parsing issue with the message date, the original value is written to the X-Original-Date
header and Date
is replaced with a default value.
The main issue for this thread was that we were overriding existing X-Original-Date
values rather than keeping the existing value should there be one.
The code you pointed to is test code that has no bearing on our date parsing. That's to initialize our tests' message date.
Are you viewing the date for these messages in an IMAP client? As far as I can tell, this date is parsed correctly by our date parsers.
I'm guessing this was v1 of the bridge - it sounds like it was buggy, this is definitely an imported email as it's labeled Imported Sep-19-2020 06:34
and there are other headers;
X-Pm-Origin: import
X-Pm-Content-Encryption: on-import
X-Pm-Spamscore: 0
X-Pm-Spam-Action: dunno
X-Original-To: {redacted}
But no X-Pm-date
field, there were other errors during sync regarding invalid address fields so, I'll raise that one separately - I'm also viewing this in the web client -> 'Message Headers'
Edit: Interestingly, if I search for a period that should include this email - it doesn't appear in the web client, I have to search the title or another attribute so it seems like this is broken in the web search too (likely it didn't import correctly?)
@andrew-s I'm afraid that there is no other way to fix this other than to re-import the messages, we can't update the existing metadata attached to the message made during a previous import. Based on the date field you provided, it should display fine with the current Web and Bridge clients.
Regarding the error messages about failed to parse address, those addresses do not follow the the RFC5322 specification. You can ignore those.
@LBeernaertProton I've already mentioned that would be impossible, those inboxes are from almost 3 years ago - they're long gone. I'm confused as to why this can't be fixed in the bridge? when the bridge initially caused this problem (and it absolutely does not work correctly in web either as you can't filter by date on these messages).
@andrew-s There is no API on the backend for us to update the dates of existing messages, nor is there a way to do so via the IMAP protocol. We simply can't make these changes. You can try to contact costumer support and ask for assistance in that regard as this outside of the scope of things we can address in the bridge.
But to confirm. The date displays correctly on the web and the IMAP client? The only thing you can't seem to do on the web is search in that given interval?
Is this fix included in 3.0.21? I didn't see mention of GODT-2480 in the release notes...
I'm afraid this fix will be released in a future, hopefully next, update.
Further exploring the handling of date headers I see more weirdness happening. As an example a message that has the following Date
header on the original IMAP server:
date: Thu, 8 May 2003 00:27:30 METDST
and INTERNALDATE
is 08-May-2003 02:27:30 +0200
. After Easy Switch this becomes in the webclient:
X-Original-Date: Thu,
8 May 2003 00:27:30 METDST
Date: Thu,
8 May 2003 00:27:30 METDST
Easy Switch (correctly) identified that the original Date
header didn't follow the spec, copied it to the X-Original-Date
header but for some reason did not update the Date
header to a default value (assuming the webclient doesn't do its own parsing). The webclient doesn't give the X-Pm-Date
header. Now moving to the bridge (I updated to 3.0.21) this is what I get:
X-Pm-Date: Thu, 01 Jan 1970 00:00:00 +0000
X-Original-Date: Mon, 01 Jan 0001 00:00:00 +0000
Date: Fri, 13 Aug 1982 00:00:00 +0000
That's quite funny but I don't think it's what it should be.
A fix for this issue is present in version 3.1 which is now in pre-release.
OK the new version 3.1.3 solves the issue with X-Original-Date
.
The issue reported above where Date
is different between the webclient and Bridge still stands.
@jknockaert We had issues in the past with Apple Mail where if we did not present a valid date it could cause the client to crash.
@LBeernaertProton Alright then I guess we can close this one.
So I had a bunch of emails imported using Easy Switch, and then when I ran a python script using Bridge to compare the original to the imported messages I noticed some weird behaviour happening with the Date headers. So upon import some original headers are modified. The content of the original field should be saved in a new header with a
X-Original-
prefix. But then when using Bridge the corresponding header values returned are nonsense. As an example I use a message I received on 15 January 2023 at 4:23:03. In the original message the corresponding header field is:After Easy Switch this becomes in the web client:
(Not sure why the parsed
Date
header is 10s off, but that's an issue with Easy Switch which I am not discussing here.) Fetching this issue over Bridge the headers become:Not sure why
X-Pm-Date
was added in this context, but theX-Original-Date
header is definitely not quite right.Expected Behavior
Bridge should return exactly the same value for the
X-Original-Date
header as the web client.Current Behavior
Bridge returns
Mon, 01 Jan 0001 00:00:00 +0000
for theX-Original-Date
header for messages imported with Easy Switch.Steps to Reproduce
Am not exactly sure, but as far as I can see this happens to all imported messages.
Version Information
web client: 5.0.17.12 (on Firefox 110.0.1) bridge: 3.0.19 (on macOS 13.2.1)