aaronsw / html2text

Convert HTML to Markdown-formatted text.
http://www.aaronsw.com/2002/html2text/
GNU General Public License v3.0
2.65k stars, 413 forks

Problem with <font> tags --> not displaying markdown syntax #72

Open kcmoffitt opened 11 years ago

kcmoffitt commented 11 years ago

Hi, first-time poster here. I apologize in advance for not following any issue-submission protocol that may exist.

I am working on converting corporate annual reports (the default format is HTML, but with no standardized form of HTML) to text with markdown syntax. HTML2Text works perfectly for some tags, but not for <font>-type tags. In those instances, the text is displayed with no markdown syntax. I am a novice Python programmer and I cannot overcome this issue on my own.
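To make the problem concrete, here is a minimal sketch of one possible workaround: rewriting styled <font> tags as <b>/<i> before conversion. It assumes the emphasis in the reports is carried by an inline style attribute such as font-weight:bold or font-style:italic, which the sample may or may not match. The names FontStyleRewriter and rewrite_font_tags are hypothetical and not part of html2text; only the Python standard library is used.

```python
from html.parser import HTMLParser


class FontStyleRewriter(HTMLParser):
    """Hypothetical helper: turn styled <font> tags into <b>/<i> tags.

    Assumption: emphasis is expressed via an inline style attribute,
    e.g. <font style="font-weight:bold">. Numeric weights such as
    font-weight:700 are not handled, and comments and doctype
    declarations are dropped by this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []    # output fragments, joined at the end
        self.stack = []  # for each open <font>: the tags emitted in its place

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            style = (dict(attrs).get("style") or "").lower().replace(" ", "")
            replacement = []
            if "font-weight:bold" in style:
                replacement.append("b")
            if "font-style:italic" in style:
                replacement.append("i")
            self.stack.append(replacement)
            self.out.extend(f"<{t}>" for t in replacement)
        else:
            # Pass every other start tag through unchanged.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through as written.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "font" and self.stack:
            # Close the replacement tags in reverse order of opening.
            self.out.extend(f"</{t}>" for t in reversed(self.stack.pop()))
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def rewrite_font_tags(html):
    """Return html with styled <font> tags replaced by <b>/<i>."""
    parser = FontStyleRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = '<p><font style="font-weight: bold">Risk Factors</font> follow.</p>'
print(rewrite_font_tags(sample))
```

The rewritten string could then be passed to html2text as usual. Whether this helps depends on how the real reports mark emphasis, so it should be checked against the actual sample HTML.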

This research is very important as it will expose certain companies that were either negligent or incompetent in the years before and surrounding the recent financial meltdown. Any help will be greatly appreciated.

Here is some sample HTML that exhibits the problem I described above:

https://docs.google.com/document/d/1PUSJWCfnddFCMzb_qiIg7dQYxwyBJpsh-T_cR55oa-A/edit?usp=sharing

mcepl commented 11 years ago

@ordinaryProfessor this is not a good method of sharing HTML (I am afraid Google Docs does some conversion on it). Do you mean that your example is http://mcepl.fedorapeople.org/tmp/SampleHTML.html ?