It's the developer's responsibility to check the content type returned with the
title, which indicates how to treat the content (e.g. "text/plain" needs to be
escaped, whereas "text/html" will have been sanitized and should be safe to
output unescaped).
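As a minimal sketch of that check, assuming the title detail is a dictionary with "type" and "value" keys (the `render_title` helper is hypothetical, not part of feedparser, and it assumes the parser's sanitization is enabled):

```python
from html import escape


def render_title(title_detail):
    """Return title text that should be safe to place in an HTML page.

    `title_detail` is assumed to be a dict-like object with "type" and
    "value" keys; this helper is illustrative only.
    """
    content_type = title_detail.get("type", "text/plain")
    value = title_detail.get("value", "")

    if content_type in ("text/html", "application/xhtml+xml"):
        # Assumed to have been sanitized by the parser already.
        return value

    # "text/plain" (or anything unrecognized) is escaped before output.
    return escape(value)
```

Treating unknown types like "text/plain" is a deliberately cautious choice here: escaping too much only produces visible entities, while escaping too little can let markup through.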
Without seeing the feed, I took a look at how the content might trace through
the parser. `_start_title()` assumes that the content is "text/plain", so
without a type attribute it later falls on `lookslikehtml()` to guess what the
content is. `lookslikehtml()` takes a very cautious approach to changing the
content type to avoid data loss. In this case, I expect that `lookslikehtml()`
will return False, the content type will remain "text/plain", and the text
won't be sanitized.
The behavior I've described above is expected. The type should be available
under the "type" key of the "title_detail" dictionary, and should be set to
"text/plain".
However, if this isn't the case, please provide the URL to a
publicly-accessible Twitter feed that's exhibiting this behavior or reply back
and attach a downloaded copy of the feed.
Original comment by kurtmckee
on 16 Feb 2011 at 1:50
[deleted comment]
Yes, it is as you said. However, the documentation on HTML sanitization is not
_clear_ enough.
An example of a Twitter Atom search result is:
http://search.twitter.com/search.atom?q=%40twitter
Original comment by db.pub.m...@gmail.com
on 16 Feb 2011 at 8:11
Alright, I think I'm on the same page now. Yes, the documentation requires an
overhaul; the page you cite still claims that CSS styles are stripped, for
example, which is no longer true. I would be delighted to receive patches or
git pull requests on GitHub [1]; otherwise I'll work on the documentation when
I'm able.
[1]: https://github.com/kurtmckee/feedparser/
Original comment by kurtmckee
on 16 Feb 2011 at 9:10
Fair enough. What I mean is that the 'basic' part of Django's `escape` function
could optionally be used (that is, replace < > & " ' with their HTML entity
counterparts).
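A rough sketch of that 'basic' escaping, written as a standalone function rather than a copy of Django's implementation (the name `basic_escape` and the choice of `&#39;` for the apostrophe are assumptions):

```python
def basic_escape(text):
    """Replace the five HTML-significant characters with entities."""
    # "&" must be handled first so the entities added by the later
    # replacements are not escaped a second time.
    replacements = [
        ("&", "&amp;"),
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
        ("'", "&#39;"),
    ]
    for char, entity in replacements:
        text = text.replace(char, entity)
    return text
```

The standard library's `html.escape()` does much the same job and is probably the better choice in practice; the explicit table is only here to show which characters are involved.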
Original comment by db.pub.m...@gmail.com
on 16 Feb 2011 at 9:24
Original comment by kurtmckee
on 27 Nov 2011 at 9:36
Fixed in r642. The new documentation should be available shortly; I'll publish
the link on the front page and announce it on the mailing list.
Original comment by kurtmckee
on 27 Nov 2011 at 10:49
Original issue reported on code.google.com by
db.pub.m...@gmail.com
on 15 Feb 2011 at 8:45