Closed pwang2 closed 3 years ago
@pwang2 thank you for the issue... this indeed seems a tidy5 bug... tidy does not recognize the <template> tag correctly, if I read the W3C documentation correctly...
While we do try to take notice of developer.mozilla.org docs, our guide is W3C, but I have found a similar reference in W3C documentation - the-template-element - which indicates the Content model: allows a considerable range of elements
...
Tidy seems for sure wrong in inserting a <table> element, if in a <template> element... but not sure there is a problem with the insertion of newlines, but maybe...
We are shortly to issue a new 5.6 release of tidy - see #600, Nov 23 - and if someone does not work on this soon it may be necessary to extend the milestone
...
Added 3 test files to my repo in_611.html, in_611-1.html, and in_611-2.html, for easy testing...
Further, for in_611-2.html, which passes Nu Html Checker, it appears tidy will move the <template> out of the <table> if present... and tidy's output ends up being very different HTML, and thus breaks browser rendering, two important issues for tidy...
I will try to find the time to look deeper. It should just involve checking tidy is not in legacy mode, and extending some internal tables... hopefully...
Hope others also get a chance to look at this, and present a patch or PR... thanks...
Thank you for your detailed analysis, @geoffmcl.
Yes, <template> mostly needs all the contextual element restrictions removed, since its content can be arbitrary HTML to be injected elsewhere. It is also part of the HTML5 web components specification, which is becoming more popular day by day.
Another trending use case: nowadays, developers no longer write a full HTML page like they did in the past. Instead, they write HTML snippets, as mentioned in #569. To make tidy-html more usable, we may expect a flag like --component that does a contextless analysis and turns off rules such as missing title, not approved by W3C, etc.
Just my 2 cents to make tidy more adoptable.
"To make tidy-html more usable, we may expect some flags like --component to do a contextless analysis to turn off the rules like missing title, not approve by w3c etc."
This is a toughie. Which rules to ignore, and which to enforce? Or don't enforce anything and become a pretty-printer only if told to? But we can't do that without catching things like missing end tags, etc.
We do have a "community" repository to capture things like specifications for Tidy. It's pretty bare right now. A great contribution to open source doesn't have to be writing C code, but can be helping us write specifications for how Tidy should behave. The current maintainers do a lot of things based on history and haven't paid a lot of attention to specifications, and given our manpower, I'm not pointing fingers given I'm one of these. Yet, it's a good thing to be specification driven.
Can you help? Do you have people that can help? We might not be able to adopt everything right away, but we can work towards it.
How, specifically, should a --component option affect Tidy's behavior? I'm not asking this out of frustration, and I want Tidy to be useful in most contexts. It's just that it's really hard for us to write complete specifications, and that type of help would probably be just as valuable as source contributions!
I agree we should strike while the iron is hot, at this turning point in web development toward component architecture.
Could you point me to the "community" repository you mention, @balthisar? I could come up with a rough draft proposal, and share my experience of using tidy in component development.
The evolution from jslint to jshint to eslint is a good path we could refer to. The open source model works pretty well for eslint. Eslint provides the ability to define rules on top of a parser, and the community takes it further to make things work the way everyone wants. I know that HTML, as a markup language, aims to tolerate mistakes (I personally don't quite agree with today's over-promised error tolerance), so it is not easy to make things work with an AST. If we could lower the bar to contribute, it would be a huge win for the project itself.
Post a PR to https://github.com/htacg/community
Moving out the milestone...
Issue fixed. Close now!
echo '<template><tr><td>1</td></tr></template>' | tidy --show-body-only 1 --show-info 0 --indent 1
outputs:
<template>
<table>
<tr>
<td>
1
</td>
</tr>
</table>
</template>
@pwang2 forgive me if I've misunderstood, but isn't this the original (incorrect) behaviour? The <template/> context makes the orphaned <tr/> legitimate, legal, and (presumably) intended -- the <table/> tags should not be interpolated...?
@simonwiles, I'm not sure whether you are talking about the output of the latest version; the issue was fixed after it was created. Only the old behavior is incorrect as per the spec.
Template should allow any arbitrary tag, such as li or th, without requiring its parent context tag. Tidy should not add ul or thead back in the template scenario.
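One way to check this expectation automatically is to compare the tags inside <template> before and after tidying. The following is only a rough sketch, not anything from the Tidy code base: the names TemplateTagCollector, tags_in_templates and inserted_tags are made up for illustration, and the tidied string is simply the output pasted earlier in this thread. It relies on Python's html.parser, which reports tags as written and does no table fix-up of its own.

```python
from html.parser import HTMLParser


class TemplateTagCollector(HTMLParser):
    """Collect the start tags seen while inside a <template> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth of currently open <template> elements
        self.tags = []   # start tags seen inside a template, in document order

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.tags.append(tag)
        if tag == "template":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "template" and self.depth > 0:
            self.depth -= 1


def tags_in_templates(html):
    """Return the list of start tags that occur inside any <template>."""
    collector = TemplateTagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def inserted_tags(source_html, tidied_html):
    """Return tags found inside templates in the output but not in the input."""
    remaining = tags_in_templates(source_html)
    extra = []
    for tag in tags_in_templates(tidied_html):
        if tag in remaining:
            remaining.remove(tag)   # accounted for by the original markup
        else:
            extra.append(tag)       # introduced by the tidying step
    return extra


source = "<template><tr><td>1</td></tr></template>"
# Output pasted in this thread (indentation omitted).
tidied = ("<template>\n<table>\n<tr>\n<td>\n1\n</td>\n</tr>\n"
          "</table>\n</template>")
print(inserted_tags(source, tidied))  # prints ['table']
```

An empty list would mean nothing was added inside the template, which is the behaviour being asked for here.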
I've just built from a clone of this repo taken a few minutes ago, and I'm still seeing the old, incorrect behaviour...
$ tidy --help
tidy [options...] [file...] [options...] [file...]
Utility to clean up and pretty print HTML/XHTML/XML.
This is modern HTML Tidy version 5.7.16.
$ echo '<template><tr><td>1</td></tr></template>' | tidy --show-body-only 1 --show-info 0 --indent 1
line 1 column 1 - Warning: missing </template> before <tr>
line 1 column 11 - Warning: inserting implicit <table>
line 1 column 30 - Warning: discarding unexpected </template>
line 1 column 11 - Warning: missing </table>
line 1 column 1 - Warning: missing </template>
Tidy found 5 warnings and 0 errors!
<template>
<table>
<tr>
<td>
1
</td>
</tr>
</table>
</template>
Is there something I've missed?
Sorry for the confusion, @simonwiles, you are right, the output is still incorrect. I guess I might have used a self-built version at that time which did fix the issue. But I have no idea where the output I pasted in the closing comment came from.
Any chance this issue could be reopened, then? I notice it's notionally on the 5.7 milestone list, due tomorrow... :)
Sure, I can. And I hope it won't be delayed.
Unfortunately, the problem is still present today, with the latest version (5.8.0):
$ echo '<template><tr><td>1</td></tr></template>' | tidy --show-body-only 1 --show-info 0 --indent 1
line 1 column 1 - Warning: missing </template> before <tr>
line 1 column 11 - Warning: inserting implicit <table>
line 1 column 30 - Warning: discarding unexpected </template>
line 1 column 11 - Warning: missing </table>
line 1 column 1 - Warning: missing </template>
Tidy found 5 warnings and 0 errors!
It seems to be fixed in 5.9.14, though.
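When comparing versions like this, it can help to pull the diagnostics out of Tidy's report instead of reading them by eye. The sketch below assumes the "line N column M - Warning: ..." layout seen in the reports pasted in this thread; the function names parse_diagnostics and implicit_insertions are hypothetical, and the script only parses text that has already been captured, it does not run tidy itself.

```python
import re

# Matches report lines such as:
#   line 1 column 11 - Warning: inserting implicit <table>
DIAGNOSTIC = re.compile(r"^line (\d+) column (\d+) - (Warning|Error): (.*)$")


def parse_diagnostics(report):
    """Return (line, column, level, message) tuples found in a Tidy report."""
    found = []
    for text in report.splitlines():
        match = DIAGNOSTIC.match(text.strip())
        if match:
            row, col, level, message = match.groups()
            found.append((int(row), int(col), level, message))
    return found


def implicit_insertions(report):
    """Return the tags Tidy says it inserted implicitly."""
    prefix = "inserting implicit "
    return [message[len(prefix):]
            for _, _, _, message in parse_diagnostics(report)
            if message.startswith(prefix)]


# Report copied from the 5.8.0 run quoted in this thread.
report = """\
line 1 column 1 - Warning: missing </template> before <tr>
line 1 column 11 - Warning: inserting implicit <table>
line 1 column 30 - Warning: discarding unexpected </template>
line 1 column 11 - Warning: missing </table>
line 1 column 1 - Warning: missing </template>
Tidy found 5 warnings and 0 errors!
"""
print(implicit_insertions(report))  # prints ['<table>']
```

A run against a version with the fix would be expected to give an empty list for this input, though that is an inference from the comment above, not something verified here.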
See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/template for valid usage of template. Tidy currently wraps the tr inside a template in a <table>. This breaks the HTML rendering.