mnako / letters

Letters, or how to parse emails in Go
MIT License

Feature request: Ignore malformed headers #47

Closed by chuckwagoncomputing 1 year ago

chuckwagoncomputing commented 1 year ago

One sender is sending emails with headers such as: X-Script/function: OrderConfirmationEmailService and X-Script/function: /Object/UserEmail.cfc/sendUserEmail

This results in errors: letters.ParseEmail: cannot read message: malformed MIME header line: X-Script/function: OrderConfirmationEmailService

It would be nice if we could have the option to ignore malformed headers that aren't critical to extracting data from the email.

chuckwagoncomputing commented 1 year ago

Actually, if I read the standard correctly, this might not be malformed at all. https://www.rfc-editor.org/rfc/rfc5322#section-2.2

Header fields are lines beginning with a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon.
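The rule quoted above can be sketched as a small Go check. This is only an illustration of the RFC text, not code from letters or from the Go standard library; the function name validFieldName is hypothetical, and the non-empty requirement is taken from the RFC grammar (field-name = 1*ftext) rather than from the quoted paragraph.

```go
package main

import "fmt"

// validFieldName (hypothetical helper) reports whether name satisfies the
// RFC 5322 rule quoted above: at least one byte, every byte a printable
// US-ASCII character (33..126 inclusive), and no colon.
func validFieldName(name string) bool {
	if name == "" {
		return false
	}
	for i := 0; i < len(name); i++ {
		c := name[i]
		if c < 33 || c > 126 || c == ':' {
			return false
		}
	}
	return true
}

func main() {
	// Sample names: the first is the header from this report.
	for _, name := range []string{"X-Script/function", "Subject", "Bad Name", "Bad:Name"} {
		fmt.Printf("%q valid: %v\n", name, validFieldName(name))
	}
}
```

Under this reading, the slash in X-Script/function is a permitted field-name character, while a space or a colon is not.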

mnako commented 1 year ago

@chuckwagoncomputing, thank you for taking the time to report this issue.

I tried reproducing it, but I was able to use Letters version 0.2.0 to correctly parse the headers:

X-Script/function: OrderConfirmationEmailService
X-Script/function: /Object/UserEmail.cfc/sendUserEmail

and got e.Headers.ExtraHeaders:

map[string][]string{
    "X-Script/function": []string{
        "OrderConfirmationEmailService", 
        "/Object/UserEmail.cfc/sendUserEmail",
    },
}

I used this report as an opportunity to achieve better test coverage for custom headers in https://github.com/mnako/letters/pull/48.

I am happy to continue investigating. Could you provide more information? Which version are you using? Can you share the raw email that cannot be parsed? If so, please remember to remove all private information before sharing.

chuckwagoncomputing commented 1 year ago

That's odd. I threw together a test program, and it produces the same output for the full email and for a file containing only an X-Script/function header. I am indeed using v0.2.0.

Output

letters.ParseEmail: cannot read message: malformed MIME header line: X-Script/function: OrderConfirmationEmailService
map[]

Program

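package main

import (
	"fmt"
	"os"

	"github.com/mnako/letters"
)

func main() {
	// The path to the raw email file is the first command-line argument.
	if len(os.Args) < 2 {
		fmt.Println("usage: program <raw-email-file>")
		return
	}
	r, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Println(err)
		return
	}
	defer r.Close()

	email, err := letters.ParseEmail(r)
	if err != nil {
		// Print the error, then whatever extra headers were collected.
		fmt.Println(err)
		fmt.Println(email.Headers.ExtraHeaders)
	} else {
		fmt.Println("No error")
	}
}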

Perhaps noteworthy is that I am running this on Windows. The behavior is the same whether the program is launched from Windows or from an MSYS2 shell, and whether the file has DOS or Unix line endings.

chuckwagoncomputing commented 1 year ago

Also, removing the / does allow the email to be parsed correctly.

chuckwagoncomputing commented 1 year ago

OK: !#$%&'*+.^_`|~

Not OK: "(),/;<=>?@[]\{}

This is suspiciously close to the set of characters that RFC 5322 allows in fields without quoting, with the following exceptions:

Not OK here, but OK in fields: /={}
OK here, but not OK in fields: `
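A character survey like the one above could be reproduced with a small probe. The sketch below assumes the rejection comes from the standard library's net/mail parser rather than from letters itself; the accepted helper and the X-Probe header name are made up for illustration. No expected output is shown because the result depends on the Go version in use.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// accepted (hypothetical helper) reports whether net/mail parses a message
// whose only header name contains the byte c.
func accepted(c byte) bool {
	raw := "X-Probe" + string(c) + "x: value\r\n\r\nbody\r\n"
	_, err := mail.ReadMessage(strings.NewReader(raw))
	return err == nil
}

func main() {
	var ok, notOK []byte
	// The punctuation characters listed in the comment above.
	for _, c := range []byte("!\"#$%&'()*+,./;<=>?@[\\]^_`{|}~") {
		if accepted(c) {
			ok = append(ok, c)
		} else {
			notOK = append(notOK, c)
		}
	}
	fmt.Printf("OK: %s\nNot OK: %s\n", ok, notOK)
}
```

Running the same probe under two different Go toolchains is one way to see whether the accepted set changed between releases.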

chuckwagoncomputing commented 1 year ago

It looks like the culprit is validHeaderFieldByte in net/textproto, which is called by net/mail.

chuckwagoncomputing commented 1 year ago

Looks like this was fixed in Go 1.21, and I'm still on Go 1.20.
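For anyone hitting the same error, a quick check along these lines may help confirm which toolchain built the binary and whether the standard library alone rejects the header. This is a sketch that bypasses letters and calls net/mail directly, on the assumption (suggested by the thread) that the error originates there; parseErr is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/mail"
	"runtime"
	"strings"
)

// parseErr (hypothetical helper) parses raw with net/mail and returns
// any error from reading the headers.
func parseErr(raw string) error {
	_, err := mail.ReadMessage(strings.NewReader(raw))
	return err
}

func main() {
	// Minimal message containing only the header from this report.
	raw := "X-Script/function: OrderConfirmationEmailService\r\n\r\nbody\r\n"

	// runtime.Version reports the Go version that built this binary.
	fmt.Println("toolchain:", runtime.Version())
	if err := parseErr(raw); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("parsed without error")
}
```

If the header is rejected here as well, upgrading the Go toolchain and rebuilding is the likely fix, with no change needed in letters.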