Open Amecom opened 4 years ago
I noticed that sequences like
<div> <span> A <em> B </em> C </span> </div>
are transformed into
<p> A </p> <em> B </em> <p> C </p>
instead of something like:
<p> A <em> B </em> C </p>
This causes unwanted line breaks in the text.
You can see an example by parsing this article from an Italian newspaper:
https://www.repubblica.it/tecnologia/sicurezza/2020/04/01/news/zoom_dalla_privacy_ai_troll_tutti_i_guai_dell_app_per_videochat-252877928/
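
To make the expected behaviour concrete, here is a minimal Python sketch using only the standard library. It is not this project's actual code: `INLINE_TAGS`, `unwrap`, `unwrap_spans` and `merge_inline_div` are hypothetical names, and the set of inline tags is an assumption chosen for illustration. The idea is that a `<div>` containing only inline content becomes a single `<p>`, with wrapper `<span>` elements dissolved and text order preserved.

```python
import xml.etree.ElementTree as ET

# Assumption: a small, illustrative set of inline (phrasing) tags.
INLINE_TAGS = {"span", "em", "strong", "a", "b", "i"}


def unwrap(parent, child):
    """Replace `child` with its own content inside `parent`, keeping text order."""
    index = list(parent).index(child)
    inner = list(child)

    def append_text(anchor, text):
        # Text following `anchor` is stored in anchor.tail;
        # with no anchor it belongs to parent.text.
        if not text:
            return
        if anchor is None:
            parent.text = (parent.text or "") + text
        else:
            anchor.tail = (anchor.tail or "") + text

    previous = parent[index - 1] if index > 0 else None
    append_text(previous, child.text)
    parent.remove(child)
    for offset, node in enumerate(inner):
        parent.insert(index + offset, node)
    last = inner[-1] if inner else previous
    append_text(last, child.tail)


def unwrap_spans(elem):
    """Dissolve every <span> below `elem`, deepest first."""
    for child in list(elem):
        unwrap_spans(child)
        if child.tag == "span":
            unwrap(elem, child)


def merge_inline_div(div):
    """Turn a <div> holding only inline content into one <p>; otherwise leave it alone."""
    if not all(el.tag in INLINE_TAGS for el in div.iter() if el is not div):
        return div
    unwrap_spans(div)
    div.tag = "p"
    return div


def normalized(elem):
    """Serialize `elem` and collapse runs of whitespace for easier comparison."""
    return " ".join(ET.tostring(elem, encoding="unicode").split())


if __name__ == "__main__":
    source = "<div> <span> A <em> B </em> C </span> </div>"
    print(normalized(merge_inline_div(ET.fromstring(source))))
```

With the input from this report, the sketch keeps A, the `<em>` element and C inside one `<p>` rather than splitting them into separate paragraphs. A `<div>` that contains block-level children is returned unchanged.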