aaronsw / html2text

Convert HTML to Markdown-formatted text.
http://www.aaronsw.com/2002/html2text/
GNU General Public License v3.0

Google docs #21

Closed nushoin closed 12 years ago

nushoin commented 12 years ago

This patch fixes several issues that I had with the handling of Google Docs.

Support for converting Google Docs to Markdown is more or less complete now, at least for simple documents. It is not fully compatible with 'standard' HTML documents, so this functionality is still protected by a command line option that defaults to false (btw, sorry about the problem with the options variable).

Note that only 'new style' Google Docs are supported. This should not be a problem as Google offers to convert old-format documents to the new format whenever you open them.
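
To illustrate the opt-in behaviour described above, here is a minimal sketch of a conversion mode gated behind a command line flag that defaults to false. The flag name `--google-doc` and the helpers `build_parser`, `select_mode`, and `main` are hypothetical names chosen for this example, not taken from the patch, and the sketch uses `argparse` only for brevity; the script's actual option handling may look different.

```python
import argparse


def build_parser():
    # Hypothetical flag name, used for illustration only.
    parser = argparse.ArgumentParser(
        description="Sketch of an opt-in conversion mode"
    )
    parser.add_argument(
        "--google-doc",
        action="store_true",
        default=False,  # off unless the user asks for it
        help="treat the input as an exported Google Doc",
    )
    return parser


def select_mode(google_doc=False):
    # Placeholder for the real conversion logic: it only reports
    # which code path would be taken.
    return "google-doc" if google_doc else "standard"


def main(argv):
    args = build_parser().parse_args(argv)
    return select_mode(args.google_doc)


if __name__ == "__main__":
    import sys

    print(main(sys.argv[1:]))
```

Keeping the default at false means existing callers that convert ordinary HTML see no change in behaviour unless they pass the flag explicitly.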

Thanks, Yariv

aaronsw commented 12 years ago

I merged it, but would you mind adding a test case or two just to make sure there aren't any serious regressions? Would be great to avoid another obvious fail like the options thing.

aaronsw commented 12 years ago

I merged it, but would you be willing to add a test case or two? It'd be really good to make sure there are no obvious regressions, like the options issue we had before.

nushoin commented 12 years ago

Cool, I will try to cook something up.