ThomasLandauer closed this issue 3 years ago
Without investigating I couldn't say either, haha :) let me know if you manage to have a look before I do here.
I didn't read RFC 5322 from start to end. But as it says (see above): "with no further processing", I'm 99% sure that it's perfectly legal to have several whitespaces in the subject. After all, you certainly can have several whitespaces in the message's body ;-)
The problem comes from the fact that you split the subject on whitespace into MimeLiteralParts. What's the purpose of that? Why don't you just keep it as a single string?
It's because 'Subject' can actually contain RFC 2047 mime-encoded parts... that may be simply considered an 'extension' to 5322, and you may still be right regardless though (haven't looked for clues why I'm not preserving more than one whitespace, or if it's intentional).
The reason this happens is this line here:
Which, as you've observed, doesn't preserve multiple whitespaces in subjects (because the separator token is '\s+' as well). This works well for RFC 2047 encoded parts when they're next to text or next to each other, and that's how it's supposed to work.
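To illustrate the effect described above, here is a minimal sketch in plain PHP. It is not the library's actual code: the function names collapseWhitespace and preserveWhitespace are hypothetical, and the only assumption taken from the thread is that the separator pattern is '\s+'. Splitting on that pattern throws the separators away, so rejoining the parts can only produce a single space. Capturing the separator keeps each whitespace run as its own token.

```php
<?php
// Hypothetical helpers for illustration only; not part of the library.

// Splits on '\s+' and rejoins with one space: the length of each
// whitespace run is lost because the separators are discarded.
function collapseWhitespace(string $subject): string
{
    $parts = preg_split('/\s+/', $subject);
    return implode(' ', $parts);
}

// Splits on the same pattern but captures the separators, so every
// whitespace run survives as a token and can be rejoined unchanged.
function preserveWhitespace(string $subject): string
{
    $tokens = preg_split('/(\s+)/', $subject, -1, PREG_SPLIT_DELIM_CAPTURE);
    return implode('', $tokens);
}

$subject = 'Hello  World'; // two spaces between the words

var_dump(collapseWhitespace($subject)); // string(11) "Hello World"
var_dump(preserveWhitespace($subject)); // string(12) "Hello  World"
```

With the capturing variant, a tokenizer could still decide per token whether a whitespace run sits between two encoded parts (where it would be dropped) or next to literal text (where it would be kept as written).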
I'm feeling torn on this one...
On the one hand, you're right -- the RFC doesn't specifically say they should be replaced by a single space as far as I can tell. On the other hand, I'm not sure most users would specifically write code to handle the extra spaces either, and the expectation may be more that a subject wouldn't contain them.
Open to hearing arguments on this one :).
Also, I'm not sure my usual Thunderbird test is valid in this case... I'm looking at their "display", which is different from their parsing. The Thunderbird 'positive' is probably more that it wasn't handled specifically in that case, or that everywhere else it's being displayed as HTML anyway and supporting multiple spaces would be more work.
$parser->getHeader('subject') does keep the whitespaces.
Yeah, I already agreed with all of those -- my remaining question is about the value in changing what's there and user expectation.
Well, if you're asking me: You cannot be proud of following every RFC, and then in this case fall back on "user expectation"...
Hahaha... well you got me there I guess :+1:
Released in 1.3.0
If I have this in the email (notice the two spaces):
...then
$message->getHeader('subject')->getValue() and $message->getHeaderValue('subject') both give me this (notice the single space):
I didn't look at your code yet, I wanted to ask you in advance: Are you doing this on purpose?
RFC 5322 says: