Open pdurbin opened 5 years ago
@pdurbin thanks for the comment...
While it is not strictly correct to say the comment "<!-- for mobile -->" is moved by Tidy, I sort of understand what you mean... a newline in the input is not in the output... but...
While you may have just noticed it - welcome to tidy - this output has been like that since the earliest release, Raggett's 4th August 2000, ie tidy-2000 - nearly 20 years ago...
So I would suggest this is a well established feature of tidy - TM like - continued in next, 5.7.22, so...
What is the use case for trying to bring back this newline, which is meaningless to html? Is it important?
Is the case strong enough to create a new Pretty Print output option? Called what? To do what, exactly? Full specs...
At present this looks like Won't Fix... but look forward to further feedback, comments, etc, etc, - thanks
@geoffmcl hi! Thanks for your thoughtful reply.
Yes, comments are meaningless to browsers but I had people in mind. Over at https://github.com/IQSS/metrics.dataverse.org/pull/5 I added the first HTML page to a project and suggested in CONTRIBUTING.md that we could use tidy to format our code. (I provide a config file.) However, I can anticipate contributors asking me, "Why does tidy move my comments around?" So here I am, asking why. To be clear, I'm also fine with any sane workaround. My workaround for now has been to delete all my HTML comments but I think this is sub-optimal. :smile:
For what it's worth, while I was researching a solution, I found this question, which seems to be related: https://stackoverflow.com/questions/537112/html-tidy-dont-move-those-comments
I hope this helps. I'm not sure if I've answered all of your questions but I'm happy to ramble on. Please let me know what you think. Thanks!
@pdurbin thank you for the compliment... I do try hard to understand issues presented...
But I do not yet understand why metrics.dataverse.org features in this...
Current libTidy does not move comments around... in general it will output them in the order they occurred in the input... for instance running the sample input in the stackoverflow post, using current tidy, will not yield the results shown...
In general, to anyone that asks "Why does tidy move my comments around?", I would answer: oops, libTidy does not do this... FULL STOP!
A broader response would be that the purpose of tidy, quoting from the tidy site, is that it "corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards"... a simple promise, but very difficult in an ever advancing html world... but at a very minimum, you can expect some relining of your input...
A sort of technical response would be that libTidy is like a browser: it reads the input stream into a tree, keeping only the important nodes and the text attached to those nodes, discarding the other, non-attached, space-only stuff... and writes that tree to an output file... that means things like -
<meta charset="UTF-8">
<!-- for mobile -->
<meta name="viewport" content=
"width=device-width, initial-scale=1">
are stored in the libTidy tree, and sent to the output, as -
StartTag meta charset="UTF-8"
Comment
StartTag meta name="viewport" content="width=device-width, initial-scale=1"
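The tree description above can be sketched in a few lines of Python. This is only an illustration built on the standard library's html.parser, not libTidy's actual parser; the NodeLister class and list_nodes helper are hypothetical names invented for this sketch. It shows that once whitespace-only text is dropped, the newline between the first <meta> and the comment is simply not stored anywhere, so a printer has nothing to restore it from.

```python
from html.parser import HTMLParser


class NodeLister(HTMLParser):
    """Collects a flat list of nodes, dropping whitespace-only text.

    Hypothetical helper for illustration; libTidy builds a real tree in C.
    """

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{value}"'
            for name, value in attrs
        )
        self.nodes.append(f"StartTag {tag}{rendered}")

    def handle_comment(self, data):
        self.nodes.append("Comment")

    def handle_data(self, data):
        # Whitespace-only text (such as the newlines between tags) is discarded.
        if data.strip():
            self.nodes.append(f"Text {data.strip()!r}")


def list_nodes(html):
    parser = NodeLister()
    parser.feed(html)
    parser.close()
    return parser.nodes


SAMPLE = '''<meta charset="UTF-8">
<!-- for mobile -->
<meta name="viewport" content=
"width=device-width, initial-scale=1">
'''

for node in list_nodes(SAMPLE):
    print(node)
```

The node list keeps the order of the input (tag, comment, tag) but carries no record of which line each node was on, which is the point being made about relining.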
A simple answer might be that tidy is not a tool that maintains your input lines... it might in general appear to do so, but that is far from the case...
It inputs your stream, tidies it, and outputs a completely relined product, which we hope you like... there are quite a number of options modifying that output...
Try an extreme example, --vertical-space auto, to see how input newlines can be totally ignored in the output...
I do not ask you to ramble on... sorry...
I go back to my first comment, is this an issue? Can this be closed?
Seek meaningful feedback, comments... thanks...
You're right. I tried --vertical-space auto and this is definitely not what I want. 😄 It removes newlines, making the HTML unreadable to people. For the project I'm working on (that "metrics" page I mentioned), I am editing HTML directly, and I'm shopping around for a command line tool that can be used to maintain consistency of HTML formatting, especially as other developers jump in.
My thought was, "I'll suggest to other developers that they can use Tidy with a config file I provide them with rules about how many spaces to indent, etc."
But you are saying that Tidy is not a tool for maintaining input files. This makes me a little sad. 😞
Perhaps a better tool for my use case would be html-beautify, which I read about at https://github.com/beautify-web/js-beautify#css--html . I haven't tried it yet because I thought I'd give Tidy a try first. I've known about Tidy for years and this seemed like a good opportunity to use it.
It sounds like you object to me saying that Tidy moves my comments around. I'm just trying to imagine how other developers would react to me suggesting that they try Tidy for the project I mentioned where we are editing HTML directly. They would probably say something like, "Why does Tidy move my comment to the end of the previous line? My comment is about the following line, not the previous line." Can we agree that Tidy moves comments to the very end of the previous line, removing all whitespace? I'm not sure how to accurately describe what Tidy does, what you say it has been doing for 20 years. From my perspective it's as if Tidy is saying, "Comments will be associated with the previous line rather than the following line." To me this is backward. I was hoping Tidy would have a flag like --comments-for-following-line to override what is, to me, surprising behavior. That is to say, I write comments like this:
<!-- for mobile -->
<meta name="viewport" content="width=device-width, initial-scale=1">
I don't write comments like this:
<meta name="viewport" content="width=device-width, initial-scale=1"><!-- for mobile -->
Maybe some people do. 😄 It's a free country. 😄
I really appreciate you reading all this! Tidy seems great and again, my workaround is simply to not include any comments in HTML. To me, this is not a great solution, though, which is why I opened this issue.
@pdurbin have you checked comments, other than in the <head>? Maybe you'll be in for another surprise...
I don't object to you saying that "Tidy moves my comments around"... as you say, it is a free country... ;=))
I took pains to try to explain that technically that is not what happens... but I seem to have failed... oh, well...
Not sure I understand the simple statement "... Tidy is not a tool for maintaining input files...", which you imply I suggested... This makes me sad...
I quoted from the aims of tidy... see http://www.html-tidy.org/ ...
When I was very active in web content creation, I used it on nearly every one of some 2,500 files... I hope others still do today... because they see some benefit...
But does it maintain input files? Well, sort of, no! Or yes... depends what you mean...
In general tidy ignores most space in the input, except where such space is significant... like <pre>, <script>, etc, etc... It generates, hopefully, publishable, fixed, valid, ... output files, with a consistent relining, and spacing, of the results... there are some options that influence this...
Certainly, if you are not happy with the final results, then maybe tidy is not what you need, want, ...
Or you can advocate for a new option, like --comments-for-following-line... need a full spec for this... <head>, <body>, cases, docs... etc... give a use case...
I do not think the one sample/output shown is sufficient -
<meta charset="UTF-8"><!-- for mobile -->
<meta name="viewport" content="width=device-width, initial-scale=1">
Question: What to do about the following? Same? Or different option(s)...
<body>
<!-- begin header -->
<h1>Header</h1>
<!-- begin content -->
<p>Content</p><!-- end content -->
<!-- begin tail -->
<p>tail</p><!-- end
tail -->
<!-- other variations -->
</body>
And see how this changes, if say the -i option is added... lots to explore, understand, decide... "tidy moves comments" is too broad...
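One way to begin the kind of spec being asked for is to name the cases. The sketch below is an assumption-laden illustration, not anything libTidy does: it treats the <body> sample above purely as text and uses a simple regex scan to label each comment as sitting on its own line or trailing other markup on the same line. The classify_comments function and the "own-line" / "trailing" labels are hypothetical names.

```python
import re

# Non-greedy match so each comment is found separately; DOTALL lets a
# comment span more than one line, as "<!-- end\ntail -->" does.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def classify_comments(html):
    """Return (comment, placement) pairs for every comment in html.

    placement is "own-line" when only whitespace precedes the comment on
    its line, and "trailing" when other markup comes first on that line.
    """
    results = []
    for match in COMMENT.finditer(html):
        # Start of the line on which the comment begins.
        line_start = html.rfind("\n", 0, match.start()) + 1
        before = html[line_start:match.start()]
        placement = "trailing" if before.strip() else "own-line"
        results.append((match.group(0), placement))
    return results


BODY_SAMPLE = """<body>
<!-- begin header -->
<h1>Header</h1>
<!-- begin content -->
<p>Content</p><!-- end content -->
<!-- begin tail -->
<p>tail</p><!-- end
tail -->
<!-- other variations -->
</body>
"""

for comment, placement in classify_comments(BODY_SAMPLE):
    print(placement, repr(comment))
```

Even this small sample has two distinct placements plus a comment that wraps across lines, which suggests any new option would need to say what happens to each.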
Seek feedback, comments, even patches, PRs, etc... thanks...
Hmm, I forgot that some people like to put comments like <!-- end content --> all over their HTML markup. I'm not one of those people (again, I put comments above the line I'm talking about) so I might need to think about this some more. Thanks for reminding me of this!
I looked around in https://github.com/htacg/tidy-html5-tests a bit and couldn't find any tests specific to comments and newlines. This, to me, seems like a logical place to start, to add or review a test that asserts the current behavior.
@pdurbin thank you for your continued feedback, and investigation, research... into the tidy-html5-tests...
While I too think there are no tests specific to comments and newlines, that I can see, there are some 64 testbase\*.html input files that have comments...
So potentially, any change in the current comment/newline output situation would probably show up in 1, or more, of these... ie fail a regression test... phase 2 - compare expected... but not sure...
But I can not see, or understand, why this is "the logical place to start" for this issue???
Somehow I agree, where we put a comment does matter to the human reader, and libTidy has made choices...
Which may matter to the humans, but not to the valid html output... choices...
As I ask, is there a new option here? Give SPECS, etc, etc...
We have one case in the <head>, different in the <body>, and influenced by the indent option...
As previously indicated, "tidy moves comments" just does not cut it... can not fix that... directly...
As stated, seek further feedback, comments, even patches, PRs, etc... thanks...
hello, +1 for Tidy leaving comments in place! please, please, please, is it so difficult to leave them where they are? just treat them as if they were a tag exactly like the others...
Hello,
In my case it can be a problem when I need to commit a single line with Git after Tidy has processed the file, let's say for a hot fix. If I don't want the comment to be committed for now (for whatever reason) I get an unclosed comment on that line.
The workaround I found:
So you can play with your code and the wrap value in order to achieve that (if the wrap value is an option for you). Of course it won't be possible in all cases.
I'm using Tidy 5.6.0.
wow, IT'S BEEN SO LONG since I commented on this thread... and nothing happened... it's so sad that a good project like this does not consider users' requests
I guess you mean it's so sad that free software projects don't have the necessary resources to do so...
By the way it's on the 5.9 milestone which has some pre-releases already. It's a slow pace, and it's okay like that.
Hi! I noticed that Tidy moves HTML comments to the previous line. Is there a way to prevent this? I'm using Tidy 5.2.0, packaged with the latest Ubuntu LTS, 18.04. Here's a "before" and "after" to show how the comment "<!-- for mobile -->" is moved by Tidy:
Before (HTML comment above the line the comment is about)
After (HTML comment moved to end of previous line)