I'm running some encoding tests and there is one result I don't understand. Take this proof of concept of an email sent with Windows-1255 encoding:
it 'does not corrupt hebrew characters if charset is set' do
  charsets = {
    to: 'UTF-8',
    html: 'utf-8',
    subject: 'UTF-8',
    from: 'UTF-8',
    text: 'Windows-1255'
  }
  expect(body_from_email({ text: "Hell\u05d1" }, charsets)).to eq 'Hellב'
end
This test fails with the result 'Hellá', because the Email class forces the encoding of the body to ISO-8859-1 and then converts it to Unicode. The byte 0xE1 is 'ב' in Windows-1255 but 'á' in ISO-8859-1, which explains the substitution. So my questions are: when an email arrives in an encoding other than UTF-8 or ISO-8859-1, will the text in the body variable be garbled the same way it is in the test? And if so, any idea how to solve this?
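For reference, here is a minimal sketch in plain Ruby that reproduces the garbling outside the test and shows one possible direction for a fix. It assumes the raw body bytes and the declared charset are both available at the point where the body is decoded. `decode_body` is a hypothetical helper of my own, not something the Email class provides, and the fallback to ISO-8859-1 for unknown charsets is only a guess at the current default.

```ruby
# Bytes as they would arrive for a Windows-1255 body; the last byte is 0xE1.
raw = "Hell\u05d1".encode('Windows-1255').b

# What the test observes: the bytes are labeled ISO-8859-1, then transcoded.
garbled = raw.dup.force_encoding('ISO-8859-1').encode('UTF-8')

# Hypothetical helper (not part of the Email class): label the bytes with the
# declared charset before transcoding, and fall back to ISO-8859-1 if Ruby
# does not recognize the charset name.
def decode_body(bytes, charset)
  encoding = begin
    Encoding.find(charset.to_s)
  rescue ArgumentError
    Encoding::ISO_8859_1
  end
  bytes.dup.force_encoding(encoding)
       .encode('UTF-8', invalid: :replace, undef: :replace)
end

fixed = decode_body(raw, 'Windows-1255')

puts garbled
puts fixed
```

If this matches what happens inside the Email class, then any single-byte charset other than ISO-8859-1 would be affected in the same way, not only Hebrew.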