SimpleMachines / SMF

Simple Machines Forum — SMF in short — is free and open-source community forum software, delivering professional grade features in a package that allows you to set up your own online community within minutes!
https://www.simplemachines.org/

[2.1 & 3.0]: Email 7-bit format is not checking for 1000 byte limits #8288

Open sbulen opened 2 months ago

sbulen commented 2 months ago

Basic Information

In the sendmail/mimespecialchars routines, there is no check for a character limit.

I believe the 7-bit format has a 1000 byte limit that should be honored. https://stackoverflow.com/questions/25710599/content-transfer-encoding-7bit-or-8-bit
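For reference, RFC 5322 caps every line of a message at 998 octets excluding the trailing CRLF, which gives 1000 with it. A minimal Python sketch of a check for this, for illustration only and not SMF code (`overlong_lines` is a hypothetical helper name):

```python
# RFC 5322 section 2.1.1: a line MUST be no more than 998 octets, excluding CRLF.
MAX_LINE_OCTETS = 998

def overlong_lines(raw_message: bytes, limit: int = MAX_LINE_OCTETS):
    """Return (line_number, length) for every CRLF-delimited line over `limit` octets."""
    offenders = []
    for number, line in enumerate(raw_message.split(b"\r\n"), start=1):
        if len(line) > limit:
            offenders.append((number, len(line)))
    return offenders

# A made-up message whose third line is 1080 octets long.
sample = b"Subject: test\r\n\r\n" + b"a" * 1080 + b"\r\nshort line\r\n"
print(overlong_lines(sample))
```

Running a check like this over the raw message, before it is handed to the mail transport, would show whether a given email is at risk of the "lines too long" rejection.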

I suspect this may be causing the "Message has lines too long for transport" error reported in the forum. https://www.simplemachines.org/community/index.php?topic=589251.0

Steps to reproduce

Tricky to reproduce... My host appears to simply strip the 7-bit portion from the email & use the utf8 portion.

User reports getting it when flagging a post as a forum announcement, to trigger an email. Some emails are delivered, some get undeliverable replies with the "Message has lines too long for transport" message.

What is easy to reproduce is that SMF's sendmail produces lines longer than 1000 bytes...

E.g., this 198-character line in Ukrainian: Восени 2020-го року, я вирішив, що технічно формат проекту не відповідає вимогам часу. Тому рішуче почав перебудовувати сайт на новий движок - WordPress. Планувалось, та й так і сталося, що найбільш

Becomes this 1080-byte line in the 7bit chunk of an SMF email body, where each non-ASCII character is written as a numeric entity such as &#1042; (the entities are shown decoded here): Восени 2020-го року, я вирішив, що технічно формат проекту не відповідає вимогам часу. Тому рішуче почав перебудовувати сайт на новий движок - WordPress. Планувалось, та й так і сталося, що найбільш

As a result, the 1000 byte limitation is much tougher on languages with multibyte characters.
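The growth from 198 characters to 1080 bytes is consistent with every non-ASCII character being sent as a 7-byte decimal entity. A rough Python sketch of the arithmetic, assuming that is how the 7-bit part is produced (`to_numeric_entities` is a hypothetical stand-in, not SMF's routine):

```python
def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML entity."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)

line = ("Восени 2020-го року, я вирішив, що технічно формат проекту не "
        "відповідає вимогам часу. Тому рішуче почав перебудовувати сайт на "
        "новий движок - WordPress. Планувалось, та й так і сталося, що найбільш")

encoded = to_numeric_entities(line)
print(len(line))                  # 198 characters
print(len(line.encode("utf-8")))  # 345 bytes as UTF-8
print(len(encoded))               # 1080 bytes, all ASCII, in the entity form
```

The line has 147 Cyrillic characters and 51 ASCII ones, so 147 × 7 + 51 = 1080. At 7 bytes per character, a line made entirely of such characters would hit the limit after roughly 140 characters.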

Expected result

No response

Actual result

No response

Version/Git revision

2.1.4

Database Engine

MySQL

Database Version

8.4

PHP Version

8.3.8

Logs

Additional Information

Lots of details in this thread: https://www.simplemachines.org/community/index.php?topic=589251.0

sbulen commented 2 months ago

I wonder if we should abandon the 7-bit representation altogether, and just keep the utf8 one.

Otherwise we must insert linebreaks into the source text, which would be ugly...
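To illustrate what inserting linebreaks would involve, here is a hedged Python sketch that breaks an already entity-encoded line only at spaces, so that no entity is cut in half. `fold_line` is a hypothetical helper, not an existing SMF function:

```python
def fold_line(line: str, limit: int = 998) -> list[str]:
    """Greedily break `line` at spaces so each piece is at most `limit` characters.

    Breaking only at spaces keeps entities such as &#1042; intact. A single
    word longer than `limit` is left as it is rather than split.
    """
    pieces, current = [], ""
    for word in line.split(" "):
        candidate = word if not current else current + " " + word
        if current and len(candidate) > limit:
            pieces.append(current)  # close the current piece at the last space
            current = word
        else:
            current = candidate
    if current:
        pieces.append(current)
    return pieces

# 200 entity-encoded "words" on one line: 1599 characters in total.
long_line = " ".join(["&#1042;"] * 200)
pieces = fold_line(long_line)
print([len(p) for p in pieces])
```

The cost is the one noted above: the recipient sees a line break where the author typed a space, and text without spaces cannot be folded this way at all.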

jdarwood007 commented 2 months ago

I suspect UTF-8 is about 100% easier to do now, and we would have a near 0% chance of breaking anyone's email since every email client in the last 20 years supports UTF-8.

sbulen commented 2 months ago

The good news is that our utf8 works already.

The difficult question is how safe it is to remove the old 7-bit...

You know SOMEBODY out there has users running an ancient AOL client on Windows XP...

jdarwood007 commented 2 months ago

I would say remove 7bit for 3.0; for 2.1 we leave it, even though that breaks the standard.

Sesquipedalian commented 2 months ago

Anyone running an email client that still can't support UTF-8 in 2024 would already be seeing lots of unreadable email messages. After all, if your own email service provider is simply deleting the 7-bit part in transit, @sbulen, it probably isn't the only one. I think we can kill it at our earliest convenience.
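As a point of comparison, a UTF-8 part sent with base64 transfer encoding never runs into the line limit, because the encoder wraps its output at a fixed width whatever the source text looks like. A small sketch using Python's standard `email` package, not SMF's mail code (`build_utf8_message` is a hypothetical name):

```python
from email.message import EmailMessage

def build_utf8_message(subject: str, body: str) -> bytes:
    """Build a single-part UTF-8 message whose body is base64-encoded."""
    msg = EmailMessage()
    msg["Subject"] = subject
    # cte="base64" makes the library emit short, fixed-width ASCII lines.
    msg.set_content(body, charset="utf-8", cte="base64")
    return msg.as_bytes()

# One very long line of Cyrillic text with no line breaks in it.
raw = build_utf8_message("Announcement", "Восени " * 500)
print(max(len(line) for line in raw.splitlines()))
```

The longest line in the serialized message stays well under 998 octets even though the body was written as a single line.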

sbulen commented 1 month ago

Related discussion in the forum... The 7-bit limitation also affects subject lines, which, for email purposes, are restricted to 100 bytes, which can be ~50 characters in some languages: https://www.simplemachines.org/community/index.php?topic=589350.0