Closed LetThereBeDwight closed 3 days ago
This change shouldn’t be needed, see my comment on #171
I've commented as well - the issue isn't with building it this way, the issue is in parsing it when it comes in this way via raw spec.
Please give us an example in raw spec that causes this problem. The parser should break it into the content type and params
Okay apologies I now see what was happening (kinda) - I'm actually on a different version (0.3.1) since that's the latest in hex.
When testing like this against 0.3.1 in test/mail/parsers/rfc_2822_test.exs
(apologies for the long string, using the multi-line string resulted in different parser results with really bad headers - this is what I get directly when pulling the email from s3).
test "content-type html with explicit charset" do
message =
parse_email("Return-Path: <from@from.com>\r\nDate: Tue, 09 Jul 2024 12:53:58 +0000 (UTC)\r\nFrom: \"from name\" <from@from.com>\r\nReply-to: \"reply to name\" <reply_to@prokeep.com>\r\nTo: to@to.com\r\nSubject: SR0012084728 - RFQ\r\nMIME-version: 1.0\r\nContent-type: text/html; charset=UTF-8\r\nContent-transfer-encoding: quoted-printable\r\nReturn-Path: from@from.com\r\n\r\n<html><head></head><body><p>Test</p></body></html>")
IO.inspect message, label: "Message"
IO.inspect message.headers["content-type"], label: "Content-Type"
end
We get the output:
Message: %Mail.Message{
headers: %{
"content-transfer-encoding" => "quoted-printable",
"content-type" => "text/html; charset=UTF-8",
"date" => ~U[2024-07-09 12:53:58Z],
"from" => {"from name", "from@from.com"},
"mime-version" => "1.0",
"reply-to" => "\"reply to name\" <reply_to@prokeep.com>",
"return-path" => "from@from.com",
"subject" => "SR0012084728 - RFQ",
"to" => ["to@to.com"]
},
body: "<html><head></head><body><p>Test</p></body></html>",
parts: [],
multipart: false
}
Content-Type: "text/html; charset=UTF-8"
SO - apologies for mislabling the version, which made me then go and try to update and realize that 0.4.0 isn't actually on Hex - is there a timeline for that or have you stopped pushing to hex entirely?
Granted I wouldn't have noticed these differences between get_text
and get_html
otherwise BUT I still maintain that those two parsers should align in any case, and the initial version of using put_content_type
is also valid and generates a valid Mail.Message and email.
Understood if you don't wanna take on this PR as a result as there are other facilities here that work around the issue, but I do pretty firmly believe those two matches should align.
TLDR:
1) My apologies on not recognizing the version I had in the system I was using that ran into this issue.
2) When will 0.4.0 be pushed to hex (if ever)?
3) Do you believe that it's valid for the functions get_html
and get_text
to be misaligned in the way they match content type specifications? If not, this PR is valid in it's current state and that usage of put_content_type
shows the issue in a rather valid way to my eye. If so, then feel free to close.
Thanks @LetThereBeDwight I have published 0.4.0 You are right, get_text was inconsistent and I’ve changed that in #174 along with a number of doctests to check for consistency in our examples
Closes #171.
Changes proposed in this pull request
Aligning
get_html
matches with get_text to ensure that we match on the same types of content type specifications between the two, except for the obvious actual content type part differences.