Open johanovic opened 4 years ago
This is done by the beautifulsoup4 library. I don't want to add lxml or any other dependency, so do you have any proposal for how to achieve this?
This would be a really nice enhancement; I am experiencing the same thing. For those coming to this issue, you can try the following:
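message = inbox.get_message("<SOME EMAIL ID>")
soup = message.get_body_soup()
# Text that stands in for every <br> tag
delimiter = "\n\n"
# Swap each <br> for the delimiter so the break survives get_text()
for line_break in soup.find_all("br"):
    line_break.replace_with(delimiter)
text = soup.get_text()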
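If adding a dependency is the concern, the same idea can probably be sketched with only the standard library's `html.parser`. This is a rough illustration, not the library's method: the names `TextWithBreaks`, `html_to_text`, and `BLOCK_TAGS` are made up for this example, the set of tags treated as line-ending is an assumption, and the sketch does not skip `<script>` or `<style>` content.

```python
from html.parser import HTMLParser

# Assumption: these tags should end a line; adjust the set as needed.
BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextWithBreaks(HTMLParser):
    """Collect text from HTML, emitting a newline for <br> and block tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> (and <br/>) becomes an explicit line break
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        # Closing a block-level element ends the current line
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML string with line breaks preserved."""
    parser = TextWithBreaks()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts).strip()
```

Calling `html_to_text("<p>Hello<br>World</p><p>Bye</p>")` should give the three words on separate lines. The trade-off is that this bypasses beautifulsoup4 entirely, so it would sit alongside `get_body_soup()` rather than build on it.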
I regularly extract the text of an HTML message. The current parsing method (below) fails to insert line breaks where one would expect them. Is it possible to improve this? I could do this directly in lxml (with itertext), but it might be a good enhancement for the library as a whole.