thoughtbot / griddler

Simplify receiving email in Rails (deprecated)
http://griddler.io/
MIT License

Charset handling for Hebrew characters #284

Closed morenobryan closed 4 months ago

morenobryan commented 7 years ago

I'm running some tests around encoding, and there are some results I don't understand. Take this proof of concept of an email whose text body is sent with Windows-1255 encoding:

it 'does not corrupt hebrew characters if charset is set' do
  charsets = {
    to: 'UTF-8',
    html: 'utf-8',
    subject: 'UTF-8',
    from: 'UTF-8',
    text: 'Windows-1255'
  }

  expect(body_from_email({ text: "Hell\u05d1" }, charsets)).to eq 'Hellב'
end

This test fails with 'Hellá' as the result, because the Email class forces the encoding of the body to ISO-8859-1 and then converts it to Unicode. So my question is: when an email arrives in an encoding other than UTF-8 or ISO-8859-1, will the text in the body variable be garbled the way it is in this test? If so, any idea how to solve this problem?
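
For illustration, the effect can be reproduced outside Griddler with plain Ruby string methods. This is only a minimal sketch of the mechanism, not Griddler's actual code: the helper name `to_utf8` is hypothetical, and it assumes the raw body arrives as untagged bytes along with the charset the sender declared. In Windows-1255 the letter ב is the single byte 0xE1, and that same byte is á in ISO-8859-1.

```ruby
# Hypothetical helper (not part of Griddler): tag the raw bytes with the
# given charset, then transcode them to UTF-8.
def to_utf8(raw_bytes, charset)
  raw_bytes.dup.force_encoding(charset).encode('UTF-8')
end

# The bytes a Windows-1255 sender would transmit for "Hellב".
# String#b returns a binary (ASCII-8BIT) copy, i.e. bytes with no charset attached.
raw = "Hell\u05d1".encode('Windows-1255').b

# Treating the bytes as ISO-8859-1 reads 0xE1 as "á".
garbled = to_utf8(raw, 'ISO-8859-1')   # => "Hellá"

# Treating the bytes as the declared charset recovers the Hebrew letter.
correct = to_utf8(raw, 'Windows-1255') # => "Hellב"

puts garbled
puts correct
```

If this is what happens, one possible direction would be to decode the body with the charset supplied for that field instead of a fixed ISO-8859-1, though whether that fits the Email class is an open question here.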