jakartaee / mail-api

Jakarta Mail Specification project
https://jakartaee.github.io/mail-api

ParameterList fails to parse filename from Content-Disposition header encoded in UTF-8 with Q encoding #687

Open Dr4K4n opened 1 year ago

Dr4K4n commented 1 year ago

Describe the bug: We are using the mail-api to parse incoming emails (MimeMessages). We received a particular email with a PDF attachment whose filename is encoded in the Content-Disposition header in a "weird" way. Parsing it leads to the following exception.

Stacktrace:

 Caused by: jakarta.mail.internet.ParseException: In parameter list <;
   filename==?utf-8?Q?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--111111111-XXXXXXXXXXXXXXXXXXX?=
   =?utf-8?Q?XXXXXXXXXXXXXXXXXXX=2Epdf?=;
   filename*0*=utf-8''XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--111111111-XXXXXXXXXXX;
   filename*1*=XXXXXXXXXXXXXXXXXXXXXXXXXXX.pdf>, expected parameter value, got "="
     at jakarta.mail.internet.ParameterList.<init>(ParameterList.java:273)
     at jakarta.mail.internet.ContentDisposition.<init>(ContentDisposition.java:86)
     at jakarta.mail.internet.MimeBodyPart.getDisposition(MimeBodyPart.java:1239)
     at jakarta.mail.internet.MimeBodyPart.getDisposition(MimeBodyPart.java:327)

After googling I found that the header is encoded in "Q encoding" (https://en.wikipedia.org/wiki/MIME#Difference_between_Q-encoding_and_quoted-printable), which is also described in RFC 2047 (https://www.ietf.org/rfc/rfc2047.txt). The method jakarta.mail.internet.MimeUtility.decodeText(String etext) is actually able to decode such strings.
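To make the encoding concrete, here is a minimal, dependency-free sketch of how a single Q-encoded word from RFC 2047 can be decoded. It is not the code used by MimeUtility.decodeText; the class name QEncodingSketch and the method decodeEncodedWord are hypothetical, and the sketch assumes exactly one encoded-word with Q encoding (no B encoding, no joining of adjacent encoded-words, no validation of the hex digits).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Jakarta Mail.
final class QEncodingSketch {

    // Shape of an RFC 2047 encoded-word using Q encoding: =?charset?Q?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Qq]\\?([^?]*)\\?=");

    static String decodeEncodedWord(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return word; // not a Q encoded-word, return unchanged
        }
        Charset charset = Charset.forName(m.group(1));
        String text = m.group(2);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                bytes.write(' ');            // underscore stands for a space
            } else if (c == '=' && i + 2 < text.length()) {
                // "=XX" is one byte written as two hex digits
                bytes.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);              // plain ASCII character
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Same shape as the second encoded-word in the failing header.
        System.out.println(decodeEncodedWord("=?utf-8?Q?XXXX=2Epdf?="));
    }
}
```

The interesting part for this issue is the `=2E` sequence, which is simply a Q-encoded `.` in front of the `pdf` extension.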

To Reproduce: See test cases in attached pull request #688

Expected behavior: See test cases in attached pull request #688

Environment:

lukasj commented 6 months ago

According to RFC 2231, which updates RFC 2047 and explicitly allows an encoding definition in the Content-Disposition header, the client must be told that a parameter value is encoded by means of a * in the parameter name (see the definition of extended-parameter/extended-initial-name in the grammar). So in this particular case, the header should contain:

filename*==?utf-8?Q?XX...

Also, would it be possible to share the full header for which it fails for you? Thanks.
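For reference, the extended-parameter form that RFC 2231 defines also appears in the header from the stack trace (filename*0*=utf-8''..., filename*1*=...). A rough sketch of how such continuation segments can be combined is shown below. It is an illustration only, not the ParameterList implementation; the class name Rfc2231Sketch and the method decodeSegments are hypothetical, and the sketch assumes that the segments are already in order, that every segment is percent-encoded (all names end in *), and that the first segment carries the charset'language' prefix.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Jakarta Mail.
final class Rfc2231Sketch {

    // Joins the values of name*0*, name*1*, ... and percent-decodes the result.
    static String decodeSegments(List<String> segments) {
        String joined = String.join("", segments);
        int first = joined.indexOf('\'');
        int second = joined.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("missing charset'language' prefix");
        }
        Charset charset = Charset.forName(joined.substring(0, first));
        String value = joined.substring(second + 1); // language tag is ignored here
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '%' && i + 2 < value.length()) {
                // "%XX" is one byte written as two hex digits
                bytes.write(Integer.parseInt(value.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Made-up values in the same shape as filename*0*= and filename*1*=.
        System.out.println(decodeSegments(
                Arrays.asList("utf-8''annual%20re", "port%2Epdf")));
    }
}
```

Joining the segments before decoding keeps the sketch simple and also covers a percent escape that happens to be split across two segments.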