sympa-community / sympa

Sympa, Mailing List Management Software
https://www.sympa.community/sympa
GNU General Public License v2.0
237 stars 94 forks source link

Sympa MIME boundary lines can violate 70-character limit in RFC 2046 #1795

Open adam12b1 opened 5 months ago

adam12b1 commented 5 months ago

Version

Sympa 6.2.72

Installation method

From ports on FreeBSD 13.2

Expected behavior

Sympa messages should be compliant.

Actual behavior

In a virtual robot with a long name, the Sympa-generated MIME boundary header on multipart messages can easily exceed 70 characters. That is the limit in the RFC, and if your email is processed through MailScanner, it gets blocked/quarantined/stripped with a complaint about a "Eudora long-mime-boundary attack" - but that is just one easy-to-see symptom. It occurs with MIME digests, or with moderation messages, or any other Sympa multipart MIME messages.

The heart of the problem is that these MIME boundary headers become non-compliant when they exceed 70 characters, and they should be truncated.

Steps to reproduce

Create a virtual domain with a name like "lists.a-very-long-domain-name.org", and a list within that domain, then try to submit a message for moderation or generate a MIME digest for the list, and look at the Content-Type: multipart/mixed; boundary="---------=1_<sympa.17... header. Or if you run MailScanner, you'll never see that header, you'll just see all content stripped and replaced with a warning.

Additional information

adam12b1 commented 5 months ago

And here is one way to fix the problem, simply truncating the $domain variable at 20 characters when it is inserted into the Message-ID, which is then used for the MIME boundary:

diff --git a/Sympa.pm b/Sympa.pm
index 24ee44b..d99773e 100644
--- a/Sympa.pm
+++ b/Sympa.pm
@@ -775,7 +775,7 @@ sub unique_message_id {
           (ref $that eq 'Sympa::List') ? $that->{'domain'}
         : ($that and $that ne '*') ? $that
         :                            $Conf::Conf{'domain'};
-    return sprintf '<sympa.%d.%d.%d.%d@%s>', $time, $usec, $PID,
+    return sprintf '<sympa.%d.%d.%d.%d@%.20s>', $time, $usec, $PID,
         (int rand 999), $domain;
 }

Any reason not to do that?

adam12b1 commented 5 months ago

...or perhaps this is better, not touching the Message-ID, only the MIME boundary:

diff --git a/Sympa/Message/Template.pm b/Sympa/Message/Template.pm
index a9ca7b6..44a450a 100644
--- a/Sympa/Message/Template.pm
+++ b/Sympa/Message/Template.pm
@@ -187,11 +187,11 @@ sub new {
         }
     }
     my $unique_id = Sympa::unique_message_id($robot_id);
-    $data->{'boundary'} = sprintf '----------=_%s', $unique_id
+    $data->{'boundary'} = sprintf '----------=_%0.55s', $unique_id
         unless $data->{'boundary'};
-    $data->{'boundary1'} = sprintf '---------=1_%s', $unique_id
+    $data->{'boundary1'} = sprintf '---------=1_%0.55s', $unique_id
         unless $data->{'boundary1'};
-    $data->{'boundary2'} = sprintf '---------=2_%s', $unique_id
+    $data->{'boundary2'} = sprintf '---------=2_%0.55s', $unique_id
         unless $data->{'boundary2'};

     my $self = $class->_new_from_template($that, $tpl . '.tt2',
ikedas commented 5 months ago

I think

sprintf '<sympa.%d.%d.%d.%d@%.20s>', $time, $usec, $PID,
         (int rand 999), $domain;

would be better to be

sprintf '<sympa.%d.%d.%d.%d@%s>', $time, $usec, $PID,
         (int rand 999), substr $domain, -20;

so that uniqueness across domains will be ensured as much as possible.

adam12b1 commented 5 months ago

Oh sure, that is probably an even better way, thanks!

ikedas commented 4 months ago

@adam12b1 , if possible, please check the patch in the PR above.