zendframework / zend-mime

Mime component from Zend Framework
BSD 3-Clause "New" or "Revised" License
62 stars 37 forks source link

Quoted Printable: Prevent that the next line's first character is a dot #4

Closed steffenweber closed 8 years ago

steffenweber commented 8 years ago

At least when used in combination with Exchange, Microsoft Outlook strips dots that appear at the beginning of a new line. We try to workaround this bug by wrapping early and thus preventing lines from starting with a dot.

See http://engineering.como.com/ghost-vs-outlook/

steffenweber commented 8 years ago

Thanks!

eth8505 commented 8 years ago

@steffenweber this actually breaks a lot of other things, as you don't make sure you're not breaking any qp-octets in the process. There needs to be a similar check as is done here to ensure we're not breaking within a qp-octet. Copying the block in lines 123-127 down below your code fixes the problem, but I don't know how @weierophinney feels about duplicate code here.

weierophinney commented 8 years ago

I'm fine with duplication; these are different concerns that just happen to use the same/similar solutions. Please submit a patch for review. On Apr 22, 2016 11:41 AM, "Jan-Simon Winkelmann" notifications@github.com wrote:

@steffenweber https://github.com/steffenweber this actually breaks a lot of other things, as you don't make sure you're not breaking any qp-octets in the process. There needs to be a similar check as is done here https://github.com/steffenweber/zend-mime/blob/38e07a18b4aa905c0fceca2c53b4f82facc8a871/src/Mime.php#L123 to ensure we're not breaking within a qp-octet. Copying the block in lines 123-127 down below your code fixes the problem, but I don't know how @weierophinney https://github.com/weierophinney feels about duplicate code here.

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/zendframework/zend-mime/pull/4#issuecomment-213501293

eth8505 commented 8 years ago

@weierophinney I have a fix for the exact problem, but fixing the problem like it's done here will cause more problems in rare cases. Moving dots to the next line until there is no dot could break lines containing only dots. Don't like contributing solutions with known problems...

eth8505 commented 8 years ago

@weierophinney I have looked into this issue further, and I can't seem to reproduce outlook 2013 breaking if there is a dot at the start of a line, as long as the underlying protocol implementation does the dot-stuffing as expected. @steffenweber is there any way that you could let me know what your exact test-case was?

steffenweber commented 8 years ago

We received a user complaint about broken links in our emails. The cause were missing dots in the hostname part of URLs. Those missing dots appeared after a line-wrap in the QP-encoded message. The user said that he was using Outlook 2010 in combination with Exchange Server 2010. I did and do not have access to Outlook and/or Exchange and did not reproduce this bug myself but according to http://engineering.como.com/ghost-vs-outlook/ this is a known Outlook/Exchange bug (that maybe only happens with specific configurations).

Perhaps the problem mentioned by @eth8505 can be fixed by moving the Outlook workaround a few lines up above the block "Ensure we are not splitting across an encoded character". However, I'd find it helpful to have a concrete example / test-case that we are trying to fix. Could you please provide an example or test-case where qp-octets are currently broken, @eth8505?

eth8505 commented 8 years ago

@steffenweber Do you actually have any other sources on the web describing this bug? I can't seem to find anything on the matter except the one post you provided. Generating a testcase for the broken octets is rather simple. Just add any non-ascii/multibyte character before a dot you're trying to prevent from being rendered on a new line. However, I put together a phpunit test for you here

steffenweber commented 8 years ago

I know about the user complaint and the blog post. I'm not aware of other sources that mention this issue.

Thank you for the test-cases. The first test-case is fixed by moving the workaround a few lines up as proposed in my previous comment. The second test-case shows that the Outlook workaround is applied unnecessarily even if the situation is hopeless (a whole line full of dots) but the result does not seem to be wrong (only a bit inefficient). The third test-case does not work without the Outlook workaround either (Zend-Mime applies rtrim to remove whitespace) so it seems to be unrelated (if it is a bug at all).

I've created a branch with the proposed fix for you to test. The change does not break any unit-tests but I'm not an expert on quoted-printable encoding: https://github.com/steffenweber/zend-mime/commits/qp-doesnotbreakoctets

Regarding test-case 2: The desired result can be achieved by appending && !preg_match('#^\.+$#', substr($str, 0, $ptr)) to the while condition of the Outlook workaround. I'm not sure whether that's worth it. Maybe there is an easier fix? Or we leave it as is.

eth8505 commented 8 years ago

I'm sorry I didn't get around to testing it until now. Appears to be working. I think you should leave the preg out. I assume the original implementation explicitly stuck to basic string functions for performance reasons.

What I don't understand about the fix in general, is why it got merged in the first place. I remember lots of things not getting added in the past due to not conforming to RFCs. Any insights @weierophinney ?