Closed chuckblake closed 11 years ago
The fix is to prune invalid UTF-8 bytes from the input: http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/
Not sure if that should be Griddler's responsibility or the end user's though.
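For anyone landing here, a minimal sketch of what "pruning" could look like. `prune_invalid_utf8` is a hypothetical helper name, not part of Griddler or the mail gem, and the UTF-16BE round trip is just one common way to make Ruby's transcoder run on older versions. On Ruby 2.1 and later, `String#scrub` does the same job more directly.

```ruby
# Hypothetical helper (not Griddler API): drop bytes that are not valid UTF-8.
def prune_invalid_utf8(input)
  text = input.dup.force_encoding('UTF-8')
  return text if text.valid_encoding?

  # Round-trip through UTF-16BE so the transcoder actually inspects every byte;
  # invalid or undefined sequences are replaced with an empty string.
  text.encode('UTF-16BE', invalid: :replace, undef: :replace, replace: '')
      .encode('UTF-8')
end

# "\xF6" is "ö" in Latin-1, but on its own it is not a valid UTF-8 sequence.
raw   = "K\xF6ln"
clean = prune_invalid_utf8(raw)
puts clean.valid_encoding?
```

Note that this silently drops the offending bytes, so a character sent in another charset is lost rather than converted.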
@jayroh did a little work on UTF-8 stuff a while back. He might have input here.
@chuckblake these are tough to diagnose without getting into the guts of the exact message that caused the error. I had seen this a few times on a project and had to dig through the stack and params POST'ed into the controller. Do you have that info? If so would you be able to write a failing test for this?
I wrote a post on our blog, actually, about how to tackle something like this -> http://robots.thoughtbot.com/post/42664369166/fight-back-utf-8-invalid-byte-sequences
It's not specific to Griddler, but rather to the UTF-8 issue in general.
@jayroh This helps, but only a bit. If you receive emails via SendGrid that contain special characters (umlauts), those characters are cut out. I don't know whether this is a problem with SendGrid or with Griddler.
Here is a gist of the Griddler email object: https://gist.github.com/dorra/6354910
Sounds like we need to open up a SendGrid bug report. /cc @scottmotte
I'm going to give it a shot to see if I can replicate this but it'll be tough.
@chuckblake @dorra are you guys of the opinion that this is as basic as an umlaut causing this?
Here is a subject line from one of the emails that are erroring out on me:
"The Great TOH Giveaway is Back! We’re giving away $530,324 in prizes." I think it's the ’ in "We’re" that is causing the problem.
Chuck
On Thu, Aug 29, 2013 at 11:32 AM, Joel Oliveira notifications@github.com wrote:
> I'm going to give it a shot to see if I can replicate this but it'll be tough.
> @chuckblake @dorra are you guys of the opinion this is as basic as an umlaute that's causing this?
Reply to this email directly or view it on GitHub: https://github.com/thoughtbot/griddler/issues/72#issuecomment-23498526
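A rough sketch of why a curly apostrophe like the one in that subject line might trip things up. This assumes the sender encoded the message as Windows-1252, which the thread does not confirm. In that charset ’ is the single byte 0x92, which is not valid UTF-8 on its own.

```ruby
# Assumption: the subject arrived as Windows-1252 bytes but was treated as UTF-8.
subject_bytes = "We\x92re giving away"   # 0x92 is ’ in Windows-1252

# Interpreted as UTF-8, the lone 0x92 byte makes the string invalid.
as_utf8 = subject_bytes.dup.force_encoding('UTF-8')
puts as_utf8.valid_encoding?

# Interpreted as Windows-1252 and transcoded, the apostrophe survives.
decoded = subject_bytes.dup.force_encoding('Windows-1252').encode('UTF-8')
puts decoded
```

If the charset really is known, transcoding from it keeps the character, whereas pruning invalid bytes would drop it.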
@dorra @theycallmeswift @calebthompson I've pushed a commit above :point_up: in a branch to try and address this.
@chuckblake come to think of it I'm not sure we're sanitizing subjects yet so this is a great heads up. I'll take your use case (thank you for providing that, by the way - greatly appreciated) and see if I can get some test coverage around that case.
Commit lgtm. Good work.
(concerning "If you receive emails via sendgrid and these contain special characters (Umlaute), these are cut out.", I've filed a ticket and we are looking into it here at SendGrid)
Closing after fix in 34e01c0f54aabf042991344861872cca102dafd8
On some incoming emails, I'm receiving the following error when using SendGrid inbound parse -> Griddler.
ArgumentError: invalid byte sequence in UTF-8
griddler/emails#create
vendor/bundle/ruby/1.9.1/gems/mail-2.4.4/lib/mail/core_extensions/string.rb:4
Any ideas or suggestions on how to fix this?
Here's some additional information from the backtrace:
vendor/bundle/ruby/1.9.1/gems/mail-2.4.4/lib/mail/core_extensions/string.rb:4:in `gsub'
vendor/bundle/ruby/1.9.1/gems/mail-2.4.4/lib/mail/core_extensions/string.rb:4:in `to_crlf'
vendor/bundle/ruby/1.9.1/gems/mail-2.4.4/lib/mail/header.rb:39:in `initialize'
vendor/bundle/ruby/1.9.1/gems/griddler-0.5.0/lib/griddler/email_parser.rb:45:in `new'
vendor/bundle/ruby/1.9.1/gems/griddler-0.5.0/lib/griddler/email_parser.rb:45:in `extract_headers'
vendor/bundle/ruby/1.9.1/gems/griddler-0.5.0/lib/griddler/email.rb:59:in `extract_headers'
vendor/bundle/ruby/1.9.1/gems/griddler-0.5.0/lib/griddler/email.rb:21:in `initialize'
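A small sketch that should reproduce the same class of failure outside of Griddler. The backtrace bottoms out in a `gsub` call, and a regex-based `gsub` on a string holding invalid UTF-8 raises `ArgumentError`. The header text and the line-ending regex here are made up for illustration and are not the mail gem's actual code.

```ruby
# Illustrative header with a stray 0x92 byte (not valid UTF-8 on its own).
header = "Subject: We\x92re giving away\n"

raised = false
begin
  # Normalising line endings with a regex forces Ruby to scan the string,
  # which fails when the bytes do not match the string's declared encoding.
  header.gsub(/\r\n|\n|\r/, "\r\n")
rescue ArgumentError => e
  raised = true
  puts "#{e.class}: #{e.message}"
end
```

A snippet like this could serve as the starting point for a failing test.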