ageitgey / node-unfluff

Automatically extract body content (and other cool stuff) from an html document
Apache License 2.0

How to get HTML content of the text? #87

Open malcommac opened 6 years ago

malcommac commented 6 years ago

Is there a way to return the content of the `text` property with all the HTML markup preserved? Thanks

vincenzo commented 5 years ago

That's essentially the first question that came to mind for me, too. @ageitgey mentions that this package might be useful to build an Instapaper clone; however, tools like Instapaper or Pocket retain the text formatting from the body of the article (headings, subheadings, italic, bold, links [sometimes transformed into footnotes], etc.).

ageitgey commented 5 years ago

There isn't right now. It would take some re-work to make the text retain basic formatting tags.

vincenzo commented 5 years ago

Thanks for replying @ageitgey. Any chance you could point me in the right direction if I wanted to take that upon myself?
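
One possible direction, sketched independently of unfluff's internals: once the article body has been located as an HTML fragment, strip every tag except a small whitelist of formatting tags instead of flattening everything to plain text. The helper name `keepBasicFormatting` and the whitelist below are hypothetical, and the regex approach is only an assumption chosen for brevity; a parser-based implementation would likely be more robust on malformed markup.

```javascript
// Hypothetical helper, NOT part of unfluff's API.
// Assumption: `html` is the fragment already identified as the article body.
const ALLOWED_TAGS = new Set([
  'p', 'h1', 'h2', 'h3', 'em', 'i', 'strong', 'b',
  'a', 'ul', 'ol', 'li', 'blockquote',
]);

function keepBasicFormatting(html, allowed = ALLOWED_TAGS) {
  return html
    // Drop script/style elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, '')
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, '')
    // Visit every remaining tag: keep whitelisted ones, remove the rest.
    .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b([^>]*)>/g, (match, slash, name, attrs) => {
      const tag = name.toLowerCase();
      if (!allowed.has(tag)) return '';   // strip the tag, keep its inner text
      if (slash) return `</${tag}>`;      // closing tag
      if (tag === 'a') {
        // Keep only a double-quoted href; all other attributes are discarded.
        const href = /\bhref\s*=\s*"([^"]*)"/i.exec(attrs);
        return href ? `<a href="${href[1]}">` : '<a>';
      }
      return `<${tag}>`;                  // opening tag without attributes
    });
}

const sample =
  '<div class="post"><h2 id="t">Title</h2>' +
  '<p style="x">Some <em>italic</em> and ' +
  '<a href="https://example.com" target="_blank">a link</a>.</p>' +
  '<script>alert(1)</script></div>';

console.log(keepBasicFormatting(sample));
// <h2>Title</h2><p>Some <em>italic</em> and <a href="https://example.com">a link</a>.</p>
```

The whitelist keeps headings, emphasis, lists, and links, which covers the formatting mentioned earlier in the thread, while attributes such as `class` and `style` are discarded.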