Uncodin / bypass

Skip the HTML, Bypass takes markdown and renders it directly on Android and iOS.
http://uncodin.github.com/bypass/
Apache License 2.0

Add support for HTML entities #131

Open myell0w opened 11 years ago

myell0w commented 11 years ago

When trying to parse the following text, the parser crashes in Bypass::Parser::handleSpan with the type LINK when accessing elit->second:

int pos = atoi(strs[0].c_str());
std::map<int, Element>::iterator elit = elementSoup.find(pos);
Element element = elit->second;

I can reproduce this on iOS, the text that makes the parser crash is:

Its all fine and dandy. But then the UPS man BANGS your wife.

[&#3232;\_&#3232;](http://www.assignmentx.com/wp-content/uploads/2012/10/SOUTH-PARK-Season-16-Insecurity1.jpg)

(Screenshot attached: screenshot_22 06 13_13_45-3)
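The crash described above is consistent with `elementSoup.find(pos)` returning `end()` and the iterator being dereferenced anyway, although the thread does not confirm that this is the cause. A minimal sketch of a guarded lookup follows. `Element` here is a hypothetical stand-in for the Bypass class, and `findElement` is an illustrative helper, not a function from the Bypass source.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for Bypass's Element type; the real class differs.
struct Element {
    std::string text;
};

// Illustrative helper (not part of Bypass): looks up the element whose key is
// the integer parsed from `key`. Returns nullptr when the key is absent
// instead of dereferencing the end() iterator.
const Element* findElement(const std::map<int, Element>& elementSoup,
                           const std::string& key) {
    int pos = std::atoi(key.c_str());
    std::map<int, Element>::const_iterator elit = elementSoup.find(pos);
    if (elit == elementSoup.end()) {
        return nullptr;  // the caller decides how to handle a missing element
    }
    return &elit->second;
}
```

A guard like this would only turn the crash into a recoverable condition; the entity text inside the link would still need to be handled for the output to be correct.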

myell0w commented 11 years ago

&#3232;\_&#3232; is the entity-encoded form of the Unicode text ಠ_ಠ
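To illustrate the mapping, `&#3232;` is a decimal numeric character reference for code point U+0CA0, which encodes to the three UTF-8 bytes E0 B2 A0. The sketch below decodes such references in a plain string. Both function names are assumptions made for this example and do not come from Bypass; hexadecimal references (`&#x...;`) and named entities are deliberately left out.

```cpp
#include <cstddef>
#include <string>

// Appends the UTF-8 encoding of code point `cp` to `out`.
void appendUtf8(std::string& out, unsigned long cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replaces decimal references such as "&#3232;" with their UTF-8 bytes.
// Malformed or out-of-range references are copied through unchanged.
std::string decodeNumericEntities(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "&#") == 0) {
            std::size_t j = i + 2;
            unsigned long cp = 0;
            // Stop accumulating once the value exceeds the Unicode range.
            while (j < in.size() && in[j] >= '0' && in[j] <= '9' && cp <= 0x10FFFF) {
                cp = cp * 10 + static_cast<unsigned long>(in[j] - '0');
                ++j;
            }
            bool hasDigits = j > i + 2;
            bool terminated = j < in.size() && in[j] == ';';
            bool inRange = cp > 0 && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
            if (hasDigits && terminated && inRange) {
                appendUtf8(out, cp);
                i = j + 1;
                continue;
            }
        }
        out += in[i++];
    }
    return out;
}
```

Under these assumptions, calling `decodeNumericEntities("&#3232;_&#3232;")` yields the UTF-8 bytes for ಠ_ಠ.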

heydamianc commented 11 years ago

Changing the title of this issue to more accurately reflect the true nature of the problem.

heydamianc commented 11 years ago

An HTML entity callback will need to be added here.

heydamianc commented 11 years ago

A simpler use case:

[&lt;]()

myell0w commented 11 years ago

Any hints on where to start adding support for HTML entities? I might take a look.

heydamianc commented 11 years ago

I'm actually not sure how spread out HTML entity parsing would be, but there is a callback here if you're already using the split-up project, or here if not. If you look at the bottom of the same file, you can see how to integrate it if you're unfamiliar with it.

The interesting thing about this is that I don't think the callback is invoked inside a link, which is the case in the initial failure. I only played around with it briefly before starting down the path of splitting the projects up. It may be a bigger task, similar to getting inline link parsing...

Let me know if you need more information and I'll see if I can help more.

insanj commented 9 years ago

This has been causing fairly frequent crashes for me. In the meantime, I've been using this NSString+HTML category to decode the entities before the text is parsed.
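The same pre-decoding workaround can be sketched in C++ for callers who are not on iOS. The example below handles only the five predefined XML entities in a single pass; `decodeNamedEntities` is a hypothetical helper written for this illustration and is not the NSString+HTML category or any Bypass API.

```cpp
#include <cstddef>
#include <string>

// Decodes &lt; &gt; &amp; &quot; and &apos; in one left-to-right pass.
// Any other "&...;" sequence is copied through unchanged.
std::string decodeNamedEntities(const std::string& in) {
    static const struct { const char* name; char value; } kEntities[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''},
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& e : kEntities) {
                const std::string name(e.name);
                if (in.compare(i, name.size(), name) == 0) {
                    out += e.value;
                    i += name.size();  // skip the whole entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) {
            out += in[i++];
        }
    }
    return out;
}
```

Because decoding happens before the markdown is parsed, a decoded `<` or `&` may be interpreted as markup by the parser, so this is a stopgap and not a substitute for entity support in the parser itself.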