rust-syndication / rss

Library for serializing the RSS web content syndication format
https://crates.io/crates/rss
Apache License 2.0

Parse invalid XML #69

Open andy128k opened 6 years ago

andy128k commented 6 years ago

Broken RSS feeds are not rare in practice. I have two examples:

  1. Unescaped ampersand: <title>Here & there</title>
  2. HTML entities: <title>Text with &laquo;angle quotes&raquo;</title>

Formally, these documents are invalid XML, but in practice it would be nice to parse them anyway.

For example, Firefox skips such items but does not discard the whole feed.

Currently, this crate rejects such inputs entirely.
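
Until something like this is supported, one possible workaround is to clean up the feed text before parsing. The sketch below is only an illustration of that idea, not anything the crate provides: `sanitize_entities` and `is_char_ref` are hypothetical helpers, and the HTML entity table is a tiny assumed subset. It escapes bare ampersands and rewrites known HTML entities as numeric character references, which covers both examples above.

```rust
/// Hypothetical helper, not part of the `rss` crate: escapes bare `&` and
/// rewrites a few HTML named entities as numeric character references.
fn sanitize_entities(input: &str) -> String {
    // Entities predefined by XML itself; these are passed through unchanged.
    const XML_ENTITIES: [&str; 5] = ["amp", "lt", "gt", "quot", "apos"];
    // Small illustrative subset of HTML named entities (an assumption;
    // a real table would be much longer).
    const HTML_ENTITIES: [(&str, &str); 3] =
        [("laquo", "&#171;"), ("raquo", "&#187;"), ("nbsp", "&#160;")];

    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('&') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // Candidate entity name: everything up to the next `;`, if any.
        let name = after.find(';').map(|end| &after[..end]);
        let html = name.and_then(|n| HTML_ENTITIES.iter().find(|(k, _)| *k == n));
        match (name, html) {
            // Valid XML entity or character reference: keep as is.
            (Some(n), _) if XML_ENTITIES.contains(&n) || is_char_ref(n) => {
                out.push('&');
                out.push_str(n);
                out.push(';');
                rest = &after[n.len() + 1..];
            }
            // Known HTML entity: substitute the numeric reference.
            (Some(n), Some((_, replacement))) => {
                out.push_str(replacement);
                rest = &after[n.len() + 1..];
            }
            // Anything else is treated as a bare ampersand and escaped.
            _ => {
                out.push_str("&amp;");
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Returns true for the name part of `&#123;` or `&#x7B;` style references.
fn is_char_ref(name: &str) -> bool {
    match name.strip_prefix('#') {
        Some(num) => match num.strip_prefix('x') {
            Some(hex) => !hex.is_empty() && hex.chars().all(|c| c.is_ascii_hexdigit()),
            None => !num.is_empty() && num.chars().all(|c| c.is_ascii_digit()),
        },
        None => false,
    }
}
```

The sanitized string could then be handed to the parser as usual. Note that this is a plain text scan with no notion of CDATA sections or comments, so it would also rewrite ampersands inside those; a lenient mode inside the parser itself would not have that limitation.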