Open Jerry-Ku opened 7 years ago
The issue is in the package file utils.py (Python37\Lib\site-packages\html2text\utils.py), at lines 210, 211, and 212. Here are versions of those lines that work:

    text = config.RE_MD_DOT_MATCHER.sub(r"\1\2", text)
    text = config.RE_MD_PLUS_MATCHER.sub(r"\1\2", text)
    text = config.RE_MD_DASH_MATCHER.sub(r"\1\2", text)
The original lines have 2 extra backslashes in each replacement string. Replacing just these 3 lines should fix the issue, though I'm not sure whether it could break something else.
An extra backslash is added in front of the output when two or more '-' characters are encountered, e.g.:

    echo '<p>-</p>' | html2text    ->  '-'
    echo '<p>--</p>' | html2text   ->  '\--'

Also, if the input string has the format '[0-9].[space]', the output will be '[0-9]\.[space]', e.g.:

    echo '<p>.</p>' | html2text    ->  '.'
    echo '<p>..</p>' | html2text   ->  '..'
    echo '<p>2.</p>' | html2text   ->  '2.'
    echo '<p>2. </p>' | html2text  ->  '2\. '
    echo '<p>a. </p>' | html2text  ->  'a. '
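For anyone who wants to see the effect without installing html2text, here is a minimal sketch using only the standard `re` module. The two patterns (`DOT_MATCHER`, `DASH_MATCHER`) and the `escape` helper are hypothetical stand-ins written to reproduce the outputs reported in this issue. They are not copied from html2text's `config` module, and the real patterns may differ. The only point is to compare a replacement template that inserts a backslash between the two groups with one that does not.

```python
import re

# Hypothetical stand-ins for the matchers referenced in this issue.
# They are assumptions chosen to reproduce the reported behaviour.
DOT_MATCHER = re.compile(r"^(\s*\d+)(\.)(?=\s)", re.MULTILINE)   # "2." followed by whitespace
DASH_MATCHER = re.compile(r"^(\s*)(-)(?=\s|-)", re.MULTILINE)    # "-" followed by whitespace or "-"

ESCAPING = r"\1\\\2"  # group 1, a literal backslash, then group 2
PLAIN = r"\1\2"       # group 1 then group 2, nothing inserted


def escape(text, replacement):
    """Apply both stand-in matchers with the given replacement template."""
    text = DOT_MATCHER.sub(replacement, text)
    return DASH_MATCHER.sub(replacement, text)


if __name__ == "__main__":
    for sample in ["-", "--", "2.", "2. ", "a. "]:
        print(repr(sample), "->",
              repr(escape(sample, ESCAPING)), "vs", repr(escape(sample, PLAIN)))
```

With the escaping template, only `'--'` and `'2. '` gain a backslash, which matches the examples in this report. With the plain template, every sample comes back unchanged. That also means text that would start a Markdown list or heading underline is no longer escaped, which is probably the kind of side effect worth checking before applying the change.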