nextcloud / mail

💌 Mail app for Nextcloud
https://apps.nextcloud.com/apps/mail
GNU Affero General Public License v3.0

Multipart / Attachment handling overview #8640

Open kesselb opened 1 year ago

kesselb commented 1 year ago

The purpose of this issue is to document known issues with our multipart and attachment handling.

Hidden attachments

Originally reported in https://github.com/nextcloud/mail/issues/5282 and partly fixed by https://github.com/nextcloud/mail/pull/5339.

Sample eml

```
Return-Path:
Delivered-To: jane@doe.local
Received: from fcf416df7902 by fcf416df7902 with LMTP id oKwwBSNDuWSRAQAAuSYRyA (envelope-from ) for ; Thu, 20 Jul 2023 14:22:27 +0000
Received: from localhost (unknown [172.19.0.1]) by fcf416df7902 (Postfix) with ESMTP id 063732260B32 for ; Thu, 20 Jul 2023 14:22:27 +0000 (UTC)
From: alice@test.local
To: jane@doe.local
Subject: Patches
Message-ID: <20230720162227.Horde.tQbO75MfhHKIerc-38l3V5j@pc>
User-Agent: Horde Application Framework 5
Date: Thu, 20 Jul 2023 16:22:27 +0200
Content-Type: multipart/mixed; boundary="=_ltOS0Zz8DtIKjXr3jbdMGDN"
MIME-Version: 1.0

This message is in MIME format.

--=_ltOS0Zz8DtIKjXr3jbdMGDN
Content-Type: multipart/alternative; boundary="=_sVhlJCYnkclzOHfh6XZ-w5L"

This message is in MIME format.

--=_sVhlJCYnkclzOHfh6XZ-w5L
Content-Type: text/html; charset=utf-8
Content-Description: HTML Version of Message

Hello Hello
--=_sVhlJCYnkclzOHfh6XZ-w5L
Content-Type: text/plain; charset=utf-8
Content-Description: Plaintext Version of Message

Hello Hello
--=_sVhlJCYnkclzOHfh6XZ-w5L--
--=_ltOS0Zz8DtIKjXr3jbdMGDN
Content-Type: text/x-patch; name=some.patch
Content-Disposition: inline; filename=some.patch

hello world
--=_ltOS0Zz8DtIKjXr3jbdMGDN--
```

Generated via: https://github.com/kesselb/weird-emails/blob/main/attachment_content_disposition_inline.php

Message structure

```mermaid
graph TD;
  multipart/mixed-->multipart/alternative;
  multipart/alternative-->text/html;
  multipart/alternative-->text/plain;
  multipart/mixed-->text/x-patch;
```

Screenshots

Mail

![Screenshot from 2023-07-20 18-09-07](https://github.com/nextcloud/mail/assets/3902676/ee3142de-59cd-4298-a933-faefa8c73f1e)

Thunderbird

![Screenshot from 2023-07-20 18-08-39](https://github.com/nextcloud/mail/assets/3902676/8f3e096f-3ee6-4424-9021-0509af956540)

HTML message; the Content-Disposition of the text/x-patch part is inline.

Issue: We assume that MIME parts with content-disposition = inline are referenced via content-id / cid in the HTML document. If the MIME part is not referenced, it is hidden.

Possible solutions:

1) Merge inlineAttachments and attachments, as is already done for text/plain emails.
2) If content-disposition = inline and content-id = null, treat the MIME part as a regular attachment.
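
To make the second option concrete, here is a minimal sketch in Python using the standard `email` package (not the PHP/Horde code the app actually uses). `RAW` is a shortened stand-in for the sample message with simplified boundaries, and `is_regular_attachment` and `list_attachments` are hypothetical helper names. It applies the rule literally, so a body part that explicitly declares `Content-Disposition: inline` would also be listed and would need an extra check.

```python
from email import message_from_string
from email.message import Message

# Shortened stand-in for the sample eml above (boundaries simplified).
RAW = """\
From: alice@test.local
To: jane@doe.local
Subject: Patches
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/html; charset=utf-8

<p>Hello</p>
--inner
Content-Type: text/plain; charset=utf-8

Hello
--inner--
--outer
Content-Type: text/x-patch; name=some.patch
Content-Disposition: inline; filename=some.patch

hello world
--outer--
"""


def is_regular_attachment(part: Message) -> bool:
    """Hypothetical rule: inline parts without a Content-ID cannot be
    referenced via cid: from the HTML body, so list them as attachments."""
    if part.is_multipart():
        return False
    disposition = part.get_content_disposition()  # "inline", "attachment" or None
    if disposition == "attachment":
        return True
    return disposition == "inline" and part.get("Content-ID") is None


def list_attachments(raw: str) -> list:
    """Return the filenames of all parts that should be shown as attachments."""
    msg = message_from_string(raw)
    return [part.get_filename() for part in msg.walk() if is_regular_attachment(part)]


print(list_attachments(RAW))
```

With this rule the text/x-patch part is listed, while the two body parts, which carry no Content-Disposition header in the sample, are left alone.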

Hidden message

Sample eml

```
Return-Path:
Delivered-To: jane@doe.local
Received: from fcf416df7902 by fcf416df7902 with LMTP id lko6KsJTuWQUBAAAuSYRyA (envelope-from ) for ; Thu, 20 Jul 2023 15:33:22 +0000
Received: from localhost (unknown [172.19.0.1]) by fcf416df7902 (Postfix) with ESMTP id 9BEA52260B32 for ; Thu, 20 Jul 2023 15:33:22 +0000 (UTC)
From: alice@test.local
To: jane@doe.local
Subject: Multipart with multiple parts
Message-ID: <20230720173322.Horde.Lbu6mga1p0l6Yh4ZFm-Exh4@pc>
User-Agent: Horde Application Framework 5
Date: Thu, 20 Jul 2023 17:33:22 +0200
Content-Type: multipart/mixed; boundary="=_Xs2dih_sEN07PRgH-bx7MWq"
MIME-Version: 1.0

This message is in MIME format.

--=_Xs2dih_sEN07PRgH-bx7MWq
Content-Type: text/html; charset=utf-8
Content-Description: HTML Version of Message

Hello Hello
--=_Xs2dih_sEN07PRgH-bx7MWq
Content-Type: text/html; charset=utf-8
Content-Description: HTML Version of Message

Hope you are dooing fine
--=_Xs2dih_sEN07PRgH-bx7MWq--
```

Generated via: https://github.com/kesselb/weird-emails/blob/main/multipart_with_multiple_parts.php

Message structure

```mermaid
graph TD;
  multipart/mixed-->text/html;
  multipart/mixed-->text/html;
```

Screenshots

Mail

![Screenshot from 2023-07-20 18-33-16](https://github.com/nextcloud/mail/assets/3902676/fda00979-49f9-4729-aa93-fe910755fb84)

Thunderbird

![Screenshot from 2023-07-20 18-33-24](https://github.com/nextcloud/mail/assets/3902676/80bf0c65-2af9-4a93-b744-efb0188d0856)

Multipart message with multiple text/html parts.

Issue: We show only the first text/html part. Without looking at the message source, the reader cannot tell that there is more content.

Possible solutions:

1) Render both text/html parts.
2) Add the MIME parts as attachments.
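
As a rough illustration of the first option, the sketch below collects every text/html leaf instead of stopping at the first one. It uses Python's standard `email` package rather than the app's own PHP code. `RAW` is a shortened stand-in for the sample, and `html_bodies` is a hypothetical helper. It deliberately ignores multipart/alternative semantics, which a real implementation would have to respect.

```python
from email import message_from_string

# Shortened stand-in for the sample eml above (boundary simplified).
RAW = """\
From: alice@test.local
To: jane@doe.local
Subject: Multipart with multiple parts
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: text/html; charset=utf-8

<p>Hello</p>
--outer
Content-Type: text/html; charset=utf-8

<p>Hope you are doing fine</p>
--outer--
"""


def html_bodies(raw: str) -> list:
    """Collect the decoded content of every text/html part that is not
    explicitly marked as an attachment, in document order."""
    msg = message_from_string(raw)
    bodies = []
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        if part.get_content_disposition() == "attachment":
            continue
        charset = part.get_content_charset() or "utf-8"
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        bodies.append(payload.decode(charset, errors="replace"))
    return bodies


for body in html_bodies(RAW):
    print(body.strip())
```

A renderer could then concatenate the collected bodies, or show the first one and offer the rest as attachments (the second option).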

SebastianKrupinski commented 3 months ago

Some possible mime structures...

multipart/mixed
  text/plain, content-disposition=inline - A
  multipart/mixed
    multipart/alternative
      multipart/mixed
        text/plain, content-disposition=inline - B
        image/jpeg, content-disposition=inline - C
        text/plain, content-disposition=inline - D
      multipart/related
        text/html - E
        image/jpeg - F
    image/jpeg, content-disposition=attachment - G
    application/x-excel - H
    message/rfc822 - J
  text/plain, content-disposition=inline - K

In this case, the above algorithm would decompose this to:

textBody => [ A, B, C, D, K ]
htmlBody => [ A, E, K ]
attachments => [ C, F, G, H, J ]
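
The structure and result above appear to match the worked example in RFC 8621 (JMAP for Mail), section 4.1.4, so "the above algorithm" presumably means the decomposition described there. The following is a Python sketch modeled on that pseudocode, not code from this app. `Part`, `parse_structure` and `decompose` are illustrative names, and the `None` guards are an addition of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    label: str
    type: str
    disposition: Optional[str] = None
    name: Optional[str] = None
    sub_parts: List["Part"] = field(default_factory=list)


def is_inline_media(mime: str) -> bool:
    return mime.startswith(("image/", "audio/", "video/"))


def push(target, part):
    # A list set to None means "this flavour is not being collected here".
    if target is not None:
        target.append(part)


def parse_structure(parts, multipart_type, in_alternative, html_body, text_body, attachments):
    # Remember the lengths so a one-flavour alternative can be mirrored later.
    text_len = len(text_body) if text_body is not None else -1
    html_len = len(html_body) if html_body is not None else -1
    for i, part in enumerate(parts):
        is_inline = (
            part.disposition != "attachment"
            and (part.type in ("text/plain", "text/html") or is_inline_media(part.type))
            # In multipart/related only the first child is a body part; later
            # text parts that carry a filename are treated as attachments.
            and (i == 0 or (multipart_type != "related"
                            and (is_inline_media(part.type) or not part.name)))
        )
        if part.type.startswith("multipart/"):
            sub_type = part.type.split("/")[1]
            parse_structure(part.sub_parts, sub_type,
                            in_alternative or sub_type == "alternative",
                            html_body, text_body, attachments)
        elif is_inline:
            if multipart_type == "alternative":
                if part.type == "text/plain":
                    push(text_body, part)
                elif part.type == "text/html":
                    push(html_body, part)
                else:
                    attachments.append(part)
                continue
            if in_alternative:
                # Inside one branch of an alternative: stop feeding the other flavour.
                if part.type == "text/plain":
                    html_body = None
                if part.type == "text/html":
                    text_body = None
            push(text_body, part)
            push(html_body, part)
            if (html_body is None or text_body is None) and is_inline_media(part.type):
                attachments.append(part)
        else:
            attachments.append(part)
    if multipart_type == "alternative" and text_body is not None and html_body is not None:
        if text_len == len(text_body) and html_len != len(html_body):
            text_body.extend(html_body[html_len:])  # only HTML was found
        if html_len == len(html_body) and text_len != len(text_body):
            html_body.extend(text_body[text_len:])  # only plain text was found


def decompose(root: Part):
    text_body, html_body, attachments = [], [], []
    parse_structure([root], "mixed", False, html_body, text_body, attachments)
    labels = lambda items: [p.label for p in items]
    return labels(text_body), labels(html_body), labels(attachments)


tree = Part("root", "multipart/mixed", sub_parts=[
    Part("A", "text/plain", "inline"),
    Part("", "multipart/mixed", sub_parts=[
        Part("", "multipart/alternative", sub_parts=[
            Part("", "multipart/mixed", sub_parts=[
                Part("B", "text/plain", "inline"),
                Part("C", "image/jpeg", "inline"),
                Part("D", "text/plain", "inline"),
            ]),
            Part("", "multipart/related", sub_parts=[
                Part("E", "text/html"),
                Part("F", "image/jpeg"),
            ]),
        ]),
        Part("G", "image/jpeg", "attachment"),
        Part("H", "application/x-excel"),
        Part("J", "message/rfc822"),
    ]),
    Part("K", "text/plain", "inline"),
])
print(decompose(tree))
```

Note that C ends up in both textBody and attachments: it is inline content for the plain-text view, but the HTML view never references it, so it is also listed as an attachment.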

ivarsg commented 2 months ago

Hi, here is a simplified sample of one more MIME structure that fails to display correctly in Nextcloud Mail. This is a Domain-based Message Authentication, Reporting, and Conformance (DMARC) aggregate report sent by Google (some information removed from the message). Basically, it is a MIME message whose Content-Type is application/zip (attachment only, no other parts or content):

P.S. I just found that this is also discussed in #4423.

Return-Path: <noreply-dmarc-support@google.com>
Delivered-To: mailbox@example.com
Received: from mail.example.com
    by mail.example.com with LMTP
    id tZvOJQASw2ZcsA4ARJn5MQ
    (envelope-from <noreply-dmarc-support@google.com>)
    for <mailbox@example.com>; Mon, 19 Aug 2024 12:36:00 +0300
MIME-Version: 1.0
X-Received: by 2002:a05:6820:1acc:b0:5d5:d7fc:955c with SMTP id
 006d021491bc7-5da9801a875mr10422414eaf.5.1724060153323; Mon, 19 Aug 2024
 02:35:53 -0700 (PDT)
Date: Sun, 18 Aug 2024 16:59:59 -0700
Message-ID: <6356864336886002680@google.com>
Subject: Report domain: example.com Submitter: google.com Report-ID: 6356864336886002680
From: noreply-dmarc-support@google.com
To: postmaster@example.com
Content-Type: application/zip; 
    name="google.com!example.com!1723939200!1724025599.zip"
Content-Disposition: attachment; 
    filename="google.com!example.com!1723939200!1724025599.zip"
Content-Transfer-Encoding: base64

UEsDB... <actual-base64-content-as-well-as-unrelated-headers-removed> ...AAAA=
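
One way to handle such a message is to check whether the root part itself is a non-multipart attachment, rather than only looking for attachments among child parts. The sketch below uses Python's standard `email` package to show the idea. The file name `report.zip`, the empty ZIP payload and the helper `root_attachment` are placeholders, and treating any non-text root part as an attachment is an assumption of this sketch.

```python
import base64
from email import message_from_string

# An empty ZIP archive (end-of-central-directory record only), used as a
# placeholder for the real report payload.
EMPTY_ZIP = b"PK\x05\x06" + bytes(18)

RAW = (
    "From: noreply-dmarc-support@google.com\n"
    "To: postmaster@example.com\n"
    "Subject: Report domain: example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: application/zip; name="report.zip"\n'
    'Content-Disposition: attachment; filename="report.zip"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(EMPTY_ZIP).decode("ascii")
    + "\n"
)


def root_attachment(raw: str):
    """Return (filename, content) when the message consists of a single
    attachment part and nothing else; otherwise return None."""
    msg = message_from_string(raw)
    if msg.is_multipart():
        return None
    is_attachment = msg.get_content_disposition() == "attachment"
    # Assumption: a non-text root part without a disposition is also an attachment.
    if is_attachment or msg.get_content_maintype() != "text":
        return msg.get_filename(), msg.get_payload(decode=True)
    return None


print(root_attachment(RAW))
```

A client following this approach would show an empty body and list the root part in the attachment area.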