savonrb / nori

XML to Hash translator
MIT License

Scrub invalid characters from source XML #72

Closed alethea closed 8 years ago

alethea commented 8 years ago

Nori::Parser has a new option, :scrub_xml, which defaults to true. When it is true, the parser cleans invalid or undefined characters from the string using String#scrub if it is available (Ruby 2.1 or later), or String#encode otherwise. This should allow documents containing invalid characters to still be parsed.
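
As a rough illustration of the scrub-or-encode fallback described above, here is a minimal sketch. The helper name `scrub_invalid_chars` and the UTF-16 round trip are assumptions made for this example, not necessarily what Nori does internally.

```ruby
# Hypothetical helper for illustration only; not Nori's actual code.
def scrub_invalid_chars(string)
  if string.respond_to?(:scrub)
    # Ruby 2.1+: replace invalid byte sequences with the replacement character.
    string.scrub
  else
    # Older Rubies: round-trip through UTF-16 so the :invalid and :undef
    # options are actually applied (a same-encoding conversion may be a no-op).
    string.encode("UTF-16", invalid: :replace, undef: :replace).encode("UTF-8")
  end
end

dirty = "<tag>caf\xFF</tag>"   # \xFF is never valid in UTF-8
clean = scrub_invalid_chars(dirty)

dirty.valid_encoding?  # => false
clean.valid_encoding?  # => true
```

With the invalid bytes replaced, the string can be handed to an XML parser without raising an encoding error.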

tjarratt commented 8 years ago

Thanks for contributing this pull request @alethea!

In particular, I appreciate you changing the old stubbed tests (which were probably in the wrong place) to instead test the behavior that we care about (scrubbing out invalid characters) rather than the implementation (sending a particular message to the string). That's awesome. I care a lot about good testing hygiene, and that's something I've wanted to invest in since I started maintaining, but never really managed to dedicate enough time to. Thank you. A thousand times, thank you.

alethea commented 8 years ago

You're welcome! Thanks for the speedy response and review.