Closed NiccoMlt closed 1 year ago
I'm not able to reproduce this in 1.15.4. See this example, which returns:
<div>
<pre><span><b><u><span>TEST</span></u></b></span></pre>
</div>
jsoup does test whether an element is inside a <pre> (in Element#preserveWhitespace()) and will preserve text node formatting; it should not otherwise be formatting those elements. There is a limit (6 levels up) on how far up the stack it looks, as an optimization for serialization time, but that wouldn't have an impact in this instance. I guess this issue was resolved by one of the pretty-print fixes in 1.15.4, but I haven't checked yet.
Can you review with 1.15.4? If you find other cases where it's not working as desired, happy to take a look.
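To make the ancestor check described above concrete, here is a minimal, self-contained sketch of a bounded walk up the tree looking for a <pre>. It is not jsoup's actual code: the `Node` class, `insidePre`, `chain`, and the choice to count the starting node as the first of the 6 levels are all assumptions made for illustration.

```java
public class PreAncestorSketch {
    // Hypothetical minimal node type; jsoup's real Element is far richer.
    static final class Node {
        final String tag;
        final Node parent;

        Node(String tag, Node parent) {
            this.tag = tag;
            this.parent = parent;
        }
    }

    // The limit mentioned in the thread. How jsoup counts levels is not
    // stated there, so counting the node itself as level one is an assumption.
    static final int MAX_LEVELS = 6;

    // Returns true if the node or one of its nearest ancestors is a <pre>,
    // giving up after MAX_LEVELS nodes have been examined.
    static boolean insidePre(Node node) {
        Node current = node;
        for (int i = 0; i < MAX_LEVELS && current != null; i++) {
            if ("pre".equals(current.tag)) {
                return true;
            }
            current = current.parent;
        }
        return false;
    }

    // Helper: builds a parent-to-child chain and returns the innermost node.
    static Node chain(String... tags) {
        Node node = null;
        for (String tag : tags) {
            node = new Node(tag, node);
        }
        return node;
    }

    public static void main(String[] args) {
        // <pre> is four levels above the innermost node: found.
        Node shallow = chain("div", "pre", "span", "b", "u", "o:p");
        System.out.println(insidePre(shallow)); // true

        // <pre> is seven levels above the innermost node: the walk stops first.
        Node deep = chain("pre", "a", "b", "c", "d", "e", "f", "g");
        System.out.println(insidePre(deep)); // false
    }
}
```

The bounded loop trades completeness for speed: very deeply nested content inside a <pre> could be missed, which matches the trade-off described in the comment, but the examples in this thread are well within the limit.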
Hi, thank you for your answer. You are right about the minimal example; it seems to be fixed.
Sadly, I'm still experiencing the problem with my actual document. I cannot provide the full document, but I can provide another example:
<div>
<pre><span><b><u><o:p>TEST</o:p></u></b></span></pre>
</div>
Under Jsoup 1.15.4, this HTML is formatted as:
<html>
<head></head>
<body>
<div>
<pre><span><b><u>
<o:p>TEST
</o:p></u></b></span></pre>
</div>
</body>
</html>
Note that I replaced the <span> tag with an Office-namespaced paragraph tag, <o:p>.
HTML documents with these tags are usually produced by tools like Microsoft Word and Microsoft Outlook.
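To show why the two snippets above do not render the same, here is a small check that does not depend on jsoup: it extracts the text inside the first <pre> of each snippet and compares them. The class name `PreTextCheck` and the method `preText` are hypothetical, the regex-based tag stripping is a crude stand-in for a real parser, and the line breaks in the formatted snippet are copied as they appear in this thread (any indentation may have been lost).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PreTextCheck {
    // Matches the first <pre>...</pre> block, including line breaks inside it.
    private static final Pattern PRE =
            Pattern.compile("<pre\\b[^>]*>(.*?)</pre>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    // Matches any tag, including namespaced ones such as <o:p>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Returns the text content of the first <pre>, with tags removed and
    // whitespace left untouched, or null if there is no <pre>.
    static String preText(String html) {
        Matcher m = PRE.matcher(html);
        if (!m.find()) {
            return null;
        }
        return TAG.matcher(m.group(1)).replaceAll("");
    }

    public static void main(String[] args) {
        String input =
                "<div>\n<pre><span><b><u><o:p>TEST</o:p></u></b></span></pre>\n</div>";
        // Formatted output as quoted in the thread.
        String formatted =
                "<div>\n<pre><span><b><u>\n<o:p>TEST\n</o:p></u></b></span></pre>\n</div>";

        String before = preText(input);
        String after = preText(formatted);

        // The formatted text gained a newline on each side of TEST, and
        // whitespace inside <pre> is significant when rendered.
        System.out.println(before.equals(after)); // false
        System.out.println(after.equals("\n" + before + "\n")); // true
    }
}
```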
Thanks for the updated detail -- fixed
Hi, apparently Jsoup formats the content inside a <pre> tag, resulting in a non-equivalent rendering. Given the following HTML:
And running the following Java code
the result is
I'm using the latest version, 1.15.3.