htacg / tidy-html5

The granddaddy of HTML tools, with support for modern standards
http://www.html-tidy.org

Tidy mangles whitespace in pre and code tags #1094

Open matheusmoreira opened 1 year ago

matheusmoreira commented 1 year ago

It appears tidy does not preserve whitespace in <pre> and <code> tags. It collapses the contents of the <code> tag onto a single line, destroying code formatting.

$ HTML_TIDY=/dev/null tidy -q <<EXAMPLE
<pre>
  <code>
if (true) {
    return false;
}
  </code>
</pre>
EXAMPLE

<!DOCTYPE html>
<html>
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Linux version 5.9.14">
<title></title>
</head>
<body>
<pre>
  <code> if (true) { return false; } </code>
</pre>
</body>
</html>

I scoured the manuals for an option that would fix this but found none. The closest would be --literal-attributes but, as expected, using it made no difference.
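For anyone who wants to check a given tidy build against their own files, here is a minimal sketch that compares the <pre> block before and after tidying. The helper names `extract_pre` and `pre_preserved` are made up for illustration, and the sketch assumes `<pre>` and `</pre>` each sit on their own line, as in the example above.

```shell
#!/bin/sh
# Print every line from the first <pre> line through the closing </pre> line.
# Assumption: both tags sit on their own lines, as in the report above.
extract_pre() {
    sed -n '/<pre>/,/<\/pre>/p'
}

# Hypothetical helper: succeeds only when the <pre> block in file $2 is
# identical to the <pre> block in file $1.
pre_preserved() {
    [ "$(extract_pre < "$1")" = "$(extract_pre < "$2")" ]
}

# Intended use (not run here, since it needs tidy on PATH):
#   HTML_TIDY=/dev/null tidy -q input.html > output.html 2>/dev/null
#   pre_preserved input.html output.html && echo preserved || echo mangled
```

With the two outputs quoted in this thread, the 5.9.14 output should report as mangled and the 5.8.0 output as preserved.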

step- commented 1 year ago

It looks like a regression: 5.8.0 preserves the formatting.

$ HTML_TIDY=/dev/null tidy -q <<EXAMPLE
<pre>
  <code>
if (true) {
    return false;
}
  </code>
</pre>
EXAMPLE
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: inserting implicit <body>
line 1 column 1 - Warning: inserting missing 'title' element
<!DOCTYPE html>
<html>
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Linux version 5.8.0">
<title></title>
</head>
<body>
<pre>
  <code>
if (true) {
    return false;
}
  </code>
</pre>
</body>
</html>

Seirdy commented 10 months ago

I've identified https://github.com/htacg/tidy-html5/commit/91f29ea7b88a0f3a810d011f958ea9dd935bd65b as the source of this regression.
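For anyone who wants to reproduce that result, a rough sketch of a predicate suitable for `git bisect run` follows. It is not necessarily how the commit above was found. The function name `check_tidy` is hypothetical; the exit codes follow the `git bisect run` convention (0 = good, 1 = bad, 125 = skip this commit).

```shell
#!/bin/sh
# Hypothetical helper: check_tidy PATH_TO_TIDY_BINARY
#   returns 0   if the indented line inside <pre><code> survives,
#   returns 1   if the whitespace was collapsed,
#   returns 125 if the binary cannot be run (git bisect treats this as "skip").
check_tidy() {
    tidy_bin=$1
    [ -x "$tidy_bin" ] || return 125

    # Same input as the original report. tidy exits non-zero on warnings,
    # so its exit status is deliberately ignored.
    out=$(printf '<pre>\n  <code>\nif (true) {\n    return false;\n}\n  </code>\n</pre>\n' \
        | HTML_TIDY=/dev/null "$tidy_bin" -q 2>/dev/null || true)

    # The four-space indent before "return" only survives when the
    # formatting is preserved; the collapsed output has a single space.
    case $out in
        *'    return false;'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

To use it, save the function in a script that rebuilds tidy and ends with `check_tidy path/to/built/tidy`, then pass that script to `git bisect run` after marking a bad and a good revision. The build commands and the name of the good revision (for example, a tag for 5.8.0) are assumptions to verify against the repository.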