Open leppert opened 5 years ago
@leppert thank you for the issue...
Yes, it is sort of strange...
Can only guess the drop-empty-element
occurs, in code, before the hide-comments
... In other words, at the time empty
is checked, the comment
is still there... and tidy does not re-process the document, after the hide
... or something...
Will try to look further into it...
Meantime, look forward to further feedback... including patches, PR, ... thanks...
@leppert with more testing and debugging, it is more or less as I thought...
As you may know tidy has sort of three phases of operation - parsing, cleaning, output
The drop-empty-element
is in the parsing
stage...
The hide-comments
is in the output
stage...
And there is no re-parsing of the DOM like tree of nodes, at this stage, looking for now potentially empty elements, to drop
them... so they stay...
While this is possible, it requires someone to add the code to do this... see DropEmptyElementa and then DropComments, for example...
Is there sufficient use
case the attack this? Feels quite minor...
Look forward to further feedback... including patches, PR, ... thanks...
@geoffmcl thanks for the update and for looking further into the issue. I agree -- it's hard to imagine this being a large scale problem for users unless there are other, perhaps more common circumstances where nodes are removed in the output
stage. If I find a moment, I'll certainly attempt to submit a PR. Regardless, perhaps this ticket might prove useful should the pipeline and its order of phases ever get reworked for another reason; a bug of this small size is probably best fixed "for free" in the process of serving a larger goal.
I would expect the following two inputs to produce equivalent output. However, the second input, in which a comment is hidden, returns an empty
<p>
tag.