ProtonMail / proton-bridge

Proton Mail Bridge application
GNU General Public License v3.0
1.17k stars 157 forks source link

Message parsing issues in Import-Export app #89

Closed exander77 closed 4 years ago

exander77 commented 4 years ago

Expected Behavior

Migrate all emails without expectation and do not modify/break them in any way.

Current Behavior

Is unable to migrate emails or breaks them.

Possible Solution

Maybe fix some damn bugs or test it?

Steps to Reproduce

  1. Choose a mailbox.
  2. Migrate emails using Import-Export tool.
  3. Check the results.

Context (Environment)

My emails were either not migrated at all or broken.

Detailed Description

    140 "failed to import: failed to create draft: too many retries: Invalid sender"
      3 "failed to import: failed to import message: Error while parsing message"
      3 "failed to import: failed to parse message: malformed MIME header line: --9929ef3793b8ac4f7afe31260fa69e6a--"
      2 "failed to import: failed to parse message: malformed MIME header line: --be2b4fba2ec876e0d6baea9120d5eabde--"
      3 "failed to import: failed to parse message: malformed MIME header line: --bed3fc21cebe086ca423bad0f7ed139b0--"
      3 "failed to import: failed to parse message: malformed MIME header line: Děkujeme za Váš zájem o produkty z internetového obchodu HyperMotoShop.cz. Vaše objednávka byla přijata a bude v nejbližší době vyřízena."
    249 "failed to import: failed to parse message: mime: invalid media parameter"
     55 "failed to import: failed to parse message: multipart: NextPart: EOF"
      1 "failed to import: failed to parse message: multipart: unexpected line in Next(): \".2011&utm_content=3DElektro&utm_campaign=3DListopad\\\"><img bo=\\r\\n\""
      2 "failed to import: failed to parse message: multipart: unexpected line in Next(): \"ansparent; text-decoration: =\\r\\n\""
      2 "failed to import: failed to parse message: multipart: unexpected line in Next(): \"idka&utm_campa=\\r\\n\""
      3 "failed to import: failed to parse message: multipart: unexpected line in Next(): \"-----Mail-Builder-Boundary-2796-1575026645-1\\r\\n\""

Possible Implementation

Do not fuck with email contents.

jameshoulahan commented 4 years ago

Hi, thanks for the report.

Regarding Context, could you please specify some more information including:

Regarding Detailed Description, are you able to construct messages that allow you to reproduce each of the above errors?

exander77 commented 4 years ago

Neither the version nor operating system matters. These are the same bugs I encountered using ProtonBridge half a year ago. Nothing changed. Hence my complete dissatisfaction. I reported here, on support, no change.

But the system is Ubuntu Linux and Linux version 1.0.0 (latest as of today).

exander77 commented 4 years ago

@jameshoulahan I already sent a pack of massages to Bridge team in April (April 23rd, 2020) and broken S/MIME messages later on (April 27th, 2020). These were about Bridge, but exactly the same messages fail in Import-Export tool. Nobody is fixing anything.

jameshoulahan commented 4 years ago

I rewrote the message parser, partially based on your feedback. The new parser will be used in the upcoming version of the Import-Export app and in Bridge. If you check the commits in this repository, you'll see that things are indeed changing.

If you wish, you can try a beta version of the app with the new parser from here: https://protonmail.com/download/beta/ie/protonmail-import-export-app_1.1.1-1_amd64.deb

There are a few known quirks with the new parser which will be fixed in an upcoming beta deployment of v1.1.2, but I would appreciate any further feedback you can give me. As you said back in April (?), parsing is difficult, and I agree.

exander77 commented 4 years ago

One thing I can report is that neither Bridge nor Import-Export tool is losing an attachment from an email which leads me to believe that the bug was discovered and covered up as I have not received an apology email: Dear customers, there was a bug in ProtonBridge which caused a loss of an attachment.

exander77 commented 4 years ago

@jameshoulahan I would like to try it, but I would like to request more space on my account, having three copies of emails simultaneously pushed me to 10GB yesterday and I am not ready to remove them yet and start over, I would do Import-Export to another address.

jameshoulahan commented 4 years ago

For such requests, you'll have to contact customer support.

exander77 commented 4 years ago

@jameshoulahan I did already, but support...

exander77 commented 4 years ago

@jameshoulahan They will respond in like a week if they even do.

exander77 commented 4 years ago

Wrote them on October 2nd, 2020. This was when I did first run with the new tool. No response yet.

exander77 commented 4 years ago

@jameshoulahan Btw, I didn't notice my every email were missing an attachment until I did a comparison last night, so it went completely unnoticed for half a year. It is really hard to spot errors in large data sets without being able to in-depth analysis of them. I had to write several tools in Python to be able to easily compare messages are migrated. I am now working on a comparison of actual parts of the message. I wonder what I will find there.

jameshoulahan commented 4 years ago

Let's keep this issue focussed on one thing, namely the parser issues that you described in detailed description. For attachment issues, you already have #86 .

Let's postpone discussion about specific parser issues until you have tried an import using the rewritten parser. Virtually none of the old code is there anymore; as such, it will only lead to confusion if we begin deep-diving into issues caused by old code.

Thanks for your continued patience and for helping us to improve our product.

exander77 commented 4 years ago

@jameshoulahan When will be v1.1.2 available? If it is in a few days I may wait for it, if it is longer, I will try it ASAP if Support grants me some additional space.

jameshoulahan commented 4 years ago

All things going according to plan, we intend to deploy tomorrow on tuesday. But I'm sure you can appreciate that I can't give you any concrete guarantees.

exander77 commented 4 years ago

I wrote another mail to support, I would really like to try the new version. But I would need more space.

Funny thing, I had like 4,9GB used on my account and now I have 6,9GB after redoing migration using Import-Export tool. The migration in April lost like 2GB of attachments. I am sure all mails were there. So it was the attachments.

exander77 commented 4 years ago

Why is everything with ProtonMail so frustrating? Support responding after 5 days, me responding after 5 minutes and another day passed... I could make like 10 different accounts for testing each with 500MB, but I can't use ProtonBridge with them. And I would be probably banned based on Reddit complaints.

exander77 commented 4 years ago

@jameshoulahan So, I found a lot of problems with the previous batch and decided to delete them, so I am running v1.1.2 now. So far it seems sad, I am 11% in and I already have 165 failures.

exander77 commented 4 years ago

@jameshoulahan So, 23% in and we are at 551 failures. So over the original...

exander77 commented 4 years ago

Also, when deleting emails from my previous migration, it seems that my ThunderBird account was wrecked again. I selected all emails, pressed delete. ThunderBird says all emails are gone, but they are not gone on the server, and the number of emails is not changing. So probably an error somewhere.

exander77 commented 4 years ago

So, 39%, 1100 failures.

exander77 commented 4 years ago

Results with v1.1.2:

     66 "failed to import: failed to create draft: too many retries: Invalid sender"
      3 "failed to import: failed to import message: Error while parsing message"
      3 "failed to import: failed to parse message: failed to collect attachments: can not get encoding for '' (or '')"
     14 "failed to import: failed to parse message: failed to collect attachments: mime: invalid media parameter"
     29 "failed to import: failed to parse message: failed to create new parser: encoding error: unhandled encoding \"8t\""
      7 "failed to import: failed to parse message: failed to create new parser: illegal base64 data at input byte 0"
      3 "failed to import: failed to parse message: failed to create new parser: illegal base64 data at input byte 416"
      2 "failed to import: failed to parse message: failed to create new parser: illegal base64 data at input byte 448"
      4 "failed to import: failed to parse message: failed to create new parser: illegal base64 data at input byte 5"
      3 "failed to import: failed to parse message: failed to create new parser: message: malformed MIME header line: --9929ef3793b8ac4f7afe31260fa69e6a--\r\n"
      2 "failed to import: failed to parse message: failed to create new parser: message: malformed MIME header line: --be2b4fba2ec876e0d6baea9120d5eabde--\r\n"
      3 "failed to import: failed to parse message: failed to create new parser: message: malformed MIME header line: --bed3fc21cebe086ca423bad0f7ed139b0--\r\n"
      3 "failed to import: failed to parse message: failed to create new parser: message: malformed MIME header line: Děkujeme za Váš zájem o produkty z internetového obchodu HyperMotoShop.cz. Vaše objednávka byla přijata a bude v nejbližší době vyřízena.\r\n"
    230 "failed to import: failed to parse message: failed to create new parser: multipart: boundary is empty"
      3 "failed to import: failed to parse message: failed to create new parser: multipart: NextPart: EOF"
      2 "failed to import: failed to parse message: failed to create new parser: quotedprintable: invalid bytes after =: \"\""
      5 "failed to import: failed to parse message: failed to create new parser: quotedprintable: invalid unescaped byte 0x11 in body"
     62 "failed to import: failed to parse message: failed to create new parser: unexpected EOF"
      4 "failed to import: failed to parse message: failed to parse message header: can not get encoding for '' (or '')"
      8 "failed to import: failed to parse message: failed to parse message header: mail: expected comma"
   1745 "failed to import: failed to parse message: failed to parse message header: mail: header could not be parsed"
      3 "failed to import: failed to parse message: failed to parse message header: mail: invalid string"
    283 "failed to import: failed to parse message: failed to parse message header: mail: missing @ in addr-spec"
      6 "failed to import: failed to parse message: failed to parse message header: mail: missing '@' or angle-addr"
    136 "failed to import: failed to parse message: failed to parse message header: mail: no address"
      4 "failed to import: failed to parse message: failed to parse message header: mail: no angle-addr"
      3 "failed to import: failed to parse message: failed to parse message header: mail: trailing dot in atom"
      2 "failed to import: failed to parse message: failed to parse message header: mail: unclosed angle-addr"

Just glancing over it... it has the same error types as before and several new ones.

exander77 commented 4 years ago

After this import, I also have a ton of messages in my folder which does not return RFC822 content. It seems like some messages were imported and there are records in the ProtonMail database for them, but the content is not actually there.

exander77 commented 4 years ago

After this import, I also have a ton of messages in my folder which does not return RFC822 content. It seems like some messages were imported and there are records in the ProtonMail database for them, but the content is not actually there.

This resolved itself, so probably a server issue?

edwinlai commented 4 years ago

I'm also experiencing similar parsing issues. Specifically, I've noticed this happens consistently for emails with .docx attachments.

"Error":"failed to import: failed to parse message: failed to collect attachments: mime: unexpected content after media subtype"

horejsek commented 4 years ago

What will help us the most is if you can find the messages causing the problems and send us raw examples, if possible, to bridge@protonmail.ch. Ideally examples of messages causing the most happening issue(s). Because this is not good to share openly on GitHub, I'm closing this issue. Please, contact our customer support.

exander77 commented 4 years ago

@horejsek Reopen the issue! I already sent you a batch of messages in April and the same messages are still causing errors. There are new messages causing problems and I will prepare a batch with them. But the issue is there and it is not fixed in any way. It is worse.

jameshoulahan commented 4 years ago
Comparing old and new results, we see that the following errors appear in both the old and the new parser. There is a definite improvement there though (assuming total number of messages imported is the same). Old New
140 "Invalid sender" 66 "Invalid sender"
3 "Error while parsing message" 3 "Error while parsing message"
11 "malformed MIME header line 11 "malformed MIME header line
249 "mime: invalid media parameter" 14 "mime: invalid media parameter"
55 "NextPart: EOF" 3 "NextPart: EOF"

However, it's obvious there are quite a few issues affecting only the new parser. The most significant appear to be the following:

While developing the new parser, the focus was on correctness; we no longer reorder headers or silently strip/drop things, and as I mentioned in #85, the new parser currently only imports things it can parse "confidently". Malformed data, in most cases, leads to an error. In our test accounts, the new parser performed very well; there were little to no messages that could not be imported. However, it seems you have quite a few "special" messages.

As @horejsek said, we would greatly appreciate a sample of the messages that are causing problems for the new parser. This would help us determine whether the messages are okay and the errors are caused by bugs in the new parser itself, or whether the messages are indeed malformed in some way, which would encourage us to bump priority of better handling of "non-critical" parsing errors.

Also @exander77, you mentioned you sent us a batch of messages in April. Just letting you know, we did receive them and we incorporated them into our testing suite. A number of those errors were fixed but some still remain due to various issues in the MIME structure of those messages.

exander77 commented 4 years ago

@jameshoulahan I would like to prepare another batch. But it is hard with Import-Export tool because it does not log failed messages and does not even log Message-Id. The tool is pretty user-hostile in that regard. As it is I would have to do it by hand to pick message by subject, but there are many messages with the same subject. Or Am I missing something?

Then I modified imapsync, I logged each failed message with the Message-Id and error in the name of the file.

I can say right away that the nonstandard dates are quite common. All messages were migrated from Gmail so I doubt there are broken addresses there.

I just looked into one case I think it is this:

To: "<Some name>" <someaddress@somewhere.com>

I think you are mishandling special characters in the name in this case < and >. This was sent from a completely normal email client. I looked into some other messages and it this would be the most likely cause. Also, some clients use "<Beze jména>", "" for the address which does not have a contact filled.

jameshoulahan commented 4 years ago

The parser correctly handles messages with To: "<Some name>" <someaddress@somewhere.com>; the < and > are not the cause of the problems.

Also, some clients use "<Beze jména>", "" for the address which does not have a contact filled.

This could perhaps be the problem. I look forward to your batch of messages.

Regarding the difficulties identifying exactly which messages are causing issues, this is something we can look into improving in an upcoming app design iteration.

exander77 commented 4 years ago

The parser correctly handles messages with To: "<Some name>" <someaddress@somewhere.com>; the < and > are not the cause of the problems.

Also, some clients use "<Beze jména>", "" for the address which does not have a contact filled.

This could perhaps be the problem. I look forward to your batch of messages.

Regarding the difficulties identifying exactly which messages are causing issues, this is something we can look into improving in an upcoming app design iteration.

It is like this "<Beze jména>" followed by email or "" followed by email (not to make confusion).

It would be best if you could provide me with an Import-Export tool version with extended logging:

With this, I can provide you with very good input for debugging. (I already wrote a tool for myself to dump messages by Message-Id).

exander77 commented 4 years ago

@jameshoulahan Are there sources for Import-Export tool? I would hack some additional logging for me there in the meantime.

jameshoulahan commented 4 years ago

The sources for the Import-Export app are in this repository. You can build it with make build-ie or run a shell with make run-ie. Please note that go1.14/go1.15 suffers from a bug in the mail.ParseDate function (see here) so if you build it yourself, use go1.13.

Here is a patch you can apply to pkg/message/parser.go that dumps any raw messages that couldn't be imported to ~/.ie-errors/<subject><randomInt>.eml.

diff --git a/pkg/message/parser.go b/pkg/message/parser.go
index aeb9615..f00d359 100644
--- a/pkg/message/parser.go
+++ b/pkg/message/parser.go
@@ -21,9 +21,12 @@ import (
        "bytes"
        "fmt"
        "io"
+       "io/ioutil"
        "mime"
        "net/mail"
        "net/textproto"
+       "os"
+       "path/filepath"
        "strings"

        "github.com/ProtonMail/proton-bridge/pkg/message/parser"
@@ -38,7 +41,50 @@ import (
 func Parse(r io.Reader, key, keyName string) (m *pmapi.Message, mimeBody, plainBody string, attReaders []io.Reader, err error) {
        logrus.Trace("Parsing message")

-       p, err := parser.New(r)
+       b, err := ioutil.ReadAll(r)
+       if err != nil {
+               return
+       }
+
+       defer func() {
+               if err == nil {
+                       return
+               }
+
+               parseErr := err
+
+               homeDir, err := os.UserHomeDir()
+               if err != nil {
+                       return
+               }
+
+               errorsDir := filepath.Join(homeDir, ".ie-errors")
+
+               if err := os.MkdirAll(errorsDir, 0700); err != nil {
+                       return
+               }
+
+               dumpName := "unknown-subject"
+               if m != nil && m.Subject != "" {
+                       dumpName = m.Subject
+               }
+
+               f, err := ioutil.TempFile(errorsDir, dumpName)
+               if err != nil {
+                       return
+               }
+
+               if _, err := f.Write(b); err != nil {
+                       return
+               }
+
+               logrus.
+                       WithField("path", f.Name()).
+                       WithField("parseErr", parseErr).
+                       Error("Failed to parse message, dumped original")
+       }()
+
+       p, err := parser.New(bytes.NewReader(b))
        if err != nil {
                err = errors.Wrap(err, "failed to create new parser")
                return

Feel free to add any additional logging you may find useful.

Edit: seems copying/pasting the diff messed up spaces/tabs. Here's it as an attachment: patch.txt

exander77 commented 4 years ago

@jameshoulahan One more question, can I hack it easily not to put messages on the server at all? Just do the parsing and then do nothing?

I could retest the messages really quickly this way. I dumped them in the mbox during the weekend, so I could retest my whole mailbox locally that way.

jameshoulahan commented 4 years ago

Yes, the easiest way is to just hack the method which uploads messages to the server like so: diff.txt

exander77 commented 4 years ago

@jameshoulahan Well, that is a significantly faster way how to test it!

exander77 commented 4 years ago

@jameshoulahan Where is this message coming from "header could not be parsed"? I don't see it in sources.

jameshoulahan commented 4 years ago

It's not from bridge. I've only ever seen it come from mail/message.go, line 130. It's the result of a bad/nonstandard date. According to this, it's the only occurrence of this error string in the entire golang source.

exander77 commented 4 years ago

@jameshoulahan Oh, I see, you mentioned it before. So it should be easily handled by removing the trailing differences for Date parsing. I would say that based on the numbers above this is the most pressing issue as almost 2/3 of errors are because of this.

exander77 commented 4 years ago

Some strange things I see just by glancing over it:

From: <somebody@somwhere.com >
To: somebody@somebody.com, ; This is quite common.
To: somebody
To: <some random ascii text with spaces>
To: <Beze jména> <somebody@somebody.com> ; This is extremely common as well. Some clients seem to does this as I have it on all emails with certain people.
To: =?UTF-8?B?PEJlemUgam3DqW5hPg==?= <somebody@somebody.com> ; This is encoded <Beze jména>. Same as above.
From: somebody@somebody.com:81 ; Probably port parsing problem?
Content-Transfer-Encoding: x-custom-encoding ; Custom encoding

Based on the RFC822, I would say that all of this is ok. There can be pretty much anything in the "From", "To" etc. fields.

This is some kind of strange header sequence from one Israeli company:

MIME-Version: 1.0
Content-Transfer-Encoding: 8t
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

I am not really sure if this is RFC compliant per se as custom encodings should be prefixed with x-, but as nothing is actually encoded by it as it is overridden by the next "Content-Transfer-Encoding"... I would suggest handling it in the same way as any custom encoding.

exander77 commented 4 years ago

Another finds:

To: undisclosed-recipients:
From: Name somebody@somewhere.com
To: somebody@somewhere.com, ; Trailing comma appears in To as well and is very common.
Reply-To: somebody@somewhere.com, ; Trailing comma appears in Reply-To as well and is very common.
To: <postmaster@[10.10.10.10]>
From: "GES-ELECTRONICS <GES-ELECTRONICS"@nos.ges.cz, a.s. <ges@ges.cz>>
exander77 commented 4 years ago

Where is "invalid media parameter" coming from?

exander77 commented 4 years ago

I have also glanced over the unparsable dates (I tried to find different cases):

2 Apr 2014 10:27:40
6 Oct 2016 13:48:22
Fri, 10 Aug 2018 11:09:34 -0700 (GMT-07:00)
Fri, 13 Apr 2012 11:01:05 UT
Fri, 13 Dec 2019 10:43:36 +0000 (GMT+00:00)
Fri, 14 Dec 2018 09:19:48 -0700 (GMT-07:00)
Fri, 17 May 2019 10:46:35 +0000 (GMT+00:00)
Fri, 19 Apr 2013 14:18:25 +0200 (METDST)
Fri, 28 May 2010 10:43:37 (GMT)
Mon, 10 Dec 2018 05:35:52 -0700 (GMT-07:00)
Mon, 10 Dec 2018 22:33:45 -0700 (GMT-07:00)
Mon, 10 Nov 2014 07:30:25 +0100 (Støední Evropa (bìžný èas))
Mon, 12 May 2014 21:06:45
Sat, 2 Oct 10 19:59:17 GMT-0700
Tuesday 02 March 2010 04:02:33 PM
Wed, 02 Jun 2010 08:22:33 (GMT)
Wed, 5 Mar 2014 05:27:02 +0100 +0100
Wed, 5 May 10 05:59:05 GMT-0700
jameshoulahan commented 4 years ago

Many thanks for the valuable feedback! I hope to find some time soon to take a closer look at all the examples you've provided.

"invalid media parameter" most likely comes from line 173 of mime/mediatype.go, indicating that something went wrong while parsing the media type. It can be caused by various issues. One known example is when there are non-ascii chars in an attachment name.

exander77 commented 4 years ago

@jameshoulahan I think this way it will be much efficient, I will send you just the specific headers that cause problems. And I can easily rerun if you make improvements to see what changed. I improved error messages on some places, to wrap error with header name etc. I may send a pull request. Having a header name which causes problems in the log is extremely useful.

exander77 commented 4 years ago

@jameshoulahan After we improve this on my personal emails, I can run it on over around 300GB of emails I have at hand. When I have it patched like this (not actually storing the emails), I can run it on a lot of samples and it is even quite fast.

exander77 commented 4 years ago

New:

Cc: =?ISO-8859-2?Q?Somebody_Somewhere?= <somebody@somewhere.com>, <somebody@somewhere.com,Somebody/AAA/BBB/CCC,>
To: somebody%somewhere...com
From: "Mailer Daemon" <>
From: =?windows-1250?Q?Spr=E1vce_syst=E9mu?=
From: Správce systému ; Decoded line above, it is a message sent from Windows.
To: "'somebody@somewhere.com.'"
To: Somebody Somewhere <somebody@somewhere. com> <somebody@somewhere.com> ; "Somebody Somewhere <somebody@somewhere. com>" was actually encoded together.
To: "somebody@somewhere.com." <somebody@somewhere.com.>
exander77 commented 4 years ago

@jameshoulahan

Regarding the: failed to create new parser: multipart: boundary is empty Seems like this is some parse error and incorrect error reporting, the boundary is not empty:

Content-Type: multipart/alternative;
        boundary=----NODEMAILER-?=_19-1330086946289

Could be related to boundary starting with four ----?

Messages with the error actually seem completely normal.

It basically means all messages produced by NODEMAILER are unimportable.

exander77 commented 4 years ago

Regarding the: failed to create new parser: multipart: expecting a new Part; got line "--001a113dc39a1c56d0053a206f3e--\n" and similar. Parser has a problem when boundary does not end with: --001a113dc39a1c56d0053a206f3e--\r\n" but with: --001a113dc39a1c56d0053a206f3e--\n"

This by all means a correct message.

exander77 commented 4 years ago

Regarding the: failed to create new parser: unexpected EOF It seems to be caused by the body or part ending with space/spaces. Wtf? The message is completely misleading as removing space solves the problem. I would assume that ending part with space is completely valid.

exander77 commented 4 years ago

I have also glanced over the unparsable dates (I tried to find different cases):

I have been through RFC5332 and it all seems compliant (except rare cases):

   date-time       =   [ day-of-week "," ] date time [CFWS]
   day-of-week     =   ([FWS] day-name) / obs-day-of-week
   day-name        =   "Mon" / "Tue" / "Wed" / "Thu" /
                       "Fri" / "Sat" / "Sun"
   date            =   day month year
   day             =   ([FWS] 1*2DIGIT FWS) / obs-day
   month           =   "Jan" / "Feb" / "Mar" / "Apr" /
                       "May" / "Jun" / "Jul" / "Aug" /
                       "Sep" / "Oct" / "Nov" / "Dec"
   year            =   (FWS 4*DIGIT FWS) / obs-year
   time            =   time-of-day zone
   time-of-day     =   hour ":" minute [ ":" second ]
   hour            =   2DIGIT / obs-hour
   minute          =   2DIGIT / obs-minute
   second          =   2DIGIT / obs-second
   zone            =   (FWS ( "+" / "-" ) 4DIGIT) / obs-zone

   obs-day-of-week =   [CFWS] day-name [CFWS]
   obs-day         =   [CFWS] 1*2DIGIT [CFWS]
   obs-year        =   [CFWS] 2*DIGIT [CFWS]
   obs-hour        =   [CFWS] 2DIGIT [CFWS]
   obs-minute      =   [CFWS] 2DIGIT [CFWS]
   obs-second      =   [CFWS] 2DIGIT [CFWS]
   obs-zone        =   "UT" / "GMT" /     ; Universal Time
                                          ; North American UT
                                          ; offsets
                       "EST" / "EDT" /    ; Eastern:  - 5/ - 4
                       "CST" / "CDT" /    ; Central:  - 6/ - 5
                       "MST" / "MDT" /    ; Mountain: - 7/ - 6
                       "PST" / "PDT" /    ; Pacific:  - 8/ - 7
                                          ;
                       %d65-73 /          ; Military zones - "A"
                       %d75-90 /          ; through "I" and "K"
                       %d97-105 /         ; through "Z", both
                       %d107-122          ; upper and lower case

   FWS             =   ([*WSP CRLF] 1*WSP) /  obs-FWS
                                          ; Folding white space
   ctext           =   %d33-39 /          ; Printable US-ASCII
                       %d42-91 /          ;  characters not including
                       %d93-126 /         ;  "(", ")", or "\"
                       obs-ctext
   ccontent        =   ctext / quoted-pair / comment
   comment         =   "(" *([FWS] ccontent) [FWS] ")"
   CFWS            =   (1*([FWS] comment) [FWS]) / FWS

Other multi-character (usually between 3 and 5) alphabetic time zones have been used in Internet messages. Any such time zone whose meaning is not known SHOULD be considered equivalent to "-0000" unless there is out-of-band information confirming their meaning.

Comments are allowed at the end and that pretty much covers this:

Fri, 10 Aug 2018 11:09:34 -0700 (GMT-07:00)
Fri, 13 Dec 2019 10:43:36 +0000 (GMT+00:00)
Fri, 14 Dec 2018 09:19:48 -0700 (GMT-07:00)
Fri, 17 May 2019 10:46:35 +0000 (GMT+00:00)
Mon, 10 Dec 2018 05:35:52 -0700 (GMT-07:00)
Mon, 10 Dec 2018 22:33:45 -0700 (GMT-07:00)
Fri, 19 Apr 2013 14:18:25 +0200 (METDST)
Mon, 10 Nov 2014 07:30:25 +0100 (Støední Evropa (bì�ný èas))

This is covered by obsolete timezone (and comment):

Mon, 12 May 2014 21:06:45
Fri, 13 Apr 2012 11:01:05 UT
Fri, 28 May 2010 10:43:37 (GMT)
Wed, 02 Jun 2010 08:22:33 (GMT)
Sat, 2 Oct 10 19:59:17 GMT-0700
Wed, 5 May 10 05:59:05 GMT-0700

This is covered by missing day of the week and obsolete timezone:

2 Apr 2014 10:27:40
6 Oct 2016 13:48:22

This is covered by obsolete timezones, unknown timezone:

Tuesday 02 March 2010 04:02:33 PM

I actually found only one non-compliant date:

Wed, 5 Mar 2014 05:27:02 +0100 +0100

So it is pretty much a bug in parsing library. Probably somebody did not read the RFC carefully.

I would note here that for example, this is an RFC5332 compliant date:

(What)2            Apr 2014 10      (the) :    27:  40 HELLOWORLD (is wrong with this?)