Open David-Apps opened 3 years ago
On 10/24/21, David-Apps @.***> wrote:
Tidy removes the line breaks from the contents of a pre element when that element contains another element such as the code element. I expected tidy to preserve the line breaks inside the pre element (except perhaps a line break that immediately follows the <pre> tag).
I used HTML Tidy for Linux/x86 version 5.9.17.
looks like a regression -- 5.7.54 does the right thing:
$ tidy /tmp/x.htm
Info: Document content looks like HTML5
No warnings or errors were found.
<!DOCTYPE html>
This is the Panel
constructor:
function Panel(element, canClose, closeHandler) {
this.element = element;
this.canClose = canClose;
this.closeHandler = function () { if (closeHandler) closeHandler() };
}
5.9.17 doesn't:
$ ./tidy-html5/build/cmake/tidy.exe /tmp/x.htm
Info: Document content looks like HTML5
No warnings or errors were found.
<!DOCTYPE html>
This is the Panel
constructor:
function Panel(element, canClose, closeHandler) {
this.element = element; this.canClose = canClose; this.closeHandler =
function () { if (closeHandler) closeHandler() }; }
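A minimal sketch of how the difference between the two outputs above could be checked automatically. It uses only Python's standard `html.parser` and compares the number of line breaks inside each pre element before and after tidying. The markup in `SOURCE` and `BAD` is an assumption made up for illustration (the tags in the pasted outputs were lost), and the helper names are hypothetical.

```python
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collect the text content of every <pre> element, ignoring nested tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0       # number of <pre> elements currently open
        self.blocks = []     # one string per outermost <pre>
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.blocks.append("".join(self._current))
                self._current = []

    def handle_data(self, data):
        if self.depth > 0:
            self._current.append(data)


def pre_blocks(html):
    """Return the text of each <pre> element found in `html`."""
    parser = PreTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def line_breaks_preserved(before, after):
    """True if every <pre> keeps the same number of inner line breaks."""
    def counts(blocks):
        # Leading/trailing newlines are ignored, since a break right after
        # the opening tag is not significant.
        return [b.strip("\n").count("\n") for b in blocks]
    return counts(pre_blocks(before)) == counts(pre_blocks(after))


# Hypothetical input and a hypothetical "collapsed" result, for illustration.
SOURCE = (
    "<pre><code>function Panel(element, canClose, closeHandler) {\n"
    "  this.element = element;\n"
    "  this.canClose = canClose;\n"
    "}\n</code></pre>\n"
)
BAD = (
    "<pre><code>function Panel(element, canClose, closeHandler) {\n"
    "this.element = element; this.canClose = canClose; }</code></pre>\n"
)
```

In practice `before` would be the input file and `after` the text that tidy writes to standard output; how tidy is invoked is left out here on purpose.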
I searched for the commit that introduced the bug with git bisect.
Here is the result.
91f29ea7b88a0f3a810d011f958ea9dd935bd65b
Head: 91f29ea HTML Tidy now parses HTML non-recursively.
Tags: 5.9.8-next (1), 5.9.9-next (2)
91f29ea7b88a0f3a810d011f958ea9dd935bd65b is the first bad commit
commit 91f29ea7b88a0f3a810d011f958ea9dd935bd65b
Author: Jim Derry <balthisar@gmail.com>
Date: Thu Aug 5 08:18:30 2021 -0400
HTML Tidy now parses HTML non-recursively.
Instead of recursive calls for each nested level of HTML, the next level is
pushed to a stack on the heap, and returned to the main loop. This prevents
stack overflow at _n_ depth (where _n_ is operating-system dependent). It's
probably still possible to use all of the heap memory, but Tidy's allocators
already fail gracefully in this circumstance.
Please report any regressions of your own HTML!
NOTE: the XML parser is not affected, and is probably still highly recursive.
regression_testing/cases/dev-cases/case-001.conf | 4 +
regression_testing/cases/dev-cases/case-001@0.html | 26 +
regression_testing/cases/dev-cases/case-002.conf | 4 +
regression_testing/cases/dev-cases/case-002@1.html | 33 +
regression_testing/cases/dev-cases/case-003.conf | 4 +
regression_testing/cases/dev-cases/case-003@1.html | 27 +
regression_testing/cases/dev-cases/case-004.conf | 4 +
regression_testing/cases/dev-cases/case-004@1.html | 41 +
regression_testing/cases/dev-expects/case-001.html | 41 +
regression_testing/cases/dev-expects/case-001.txt | 14 +
regression_testing/cases/dev-expects/case-002.html | 39 +
regression_testing/cases/dev-expects/case-002.txt | 16 +
regression_testing/cases/dev-expects/case-003.html | 30 +
regression_testing/cases/dev-expects/case-003.txt | 26 +
regression_testing/cases/dev-expects/case-004.html | 61 +
regression_testing/cases/dev-expects/case-004.txt | 14 +
regression_testing/cases/special-cases/README.txt | 15 +
.../cases/special-cases/case-evil.conf | 4 +
.../cases/special-cases/case-evil@1.html | 6 +
src/parser.c | 7482 +++++++++-----------
src/parser.h | 33 +-
src/tags.h | 2 +-
22 files changed, 3890 insertions(+), 4036 deletions(-)
create mode 100755 regression_testing/cases/dev-cases/case-001.conf
create mode 100755 regression_testing/cases/dev-cases/case-001@0.html
create mode 100755 regression_testing/cases/dev-cases/case-002.conf
create mode 100755 regression_testing/cases/dev-cases/case-002@1.html
create mode 100755 regression_testing/cases/dev-cases/case-003.conf
create mode 100644 regression_testing/cases/dev-cases/case-003@1.html
create mode 100755 regression_testing/cases/dev-cases/case-004.conf
create mode 100644 regression_testing/cases/dev-cases/case-004@1.html
create mode 100644 regression_testing/cases/dev-expects/case-001.html
create mode 100644 regression_testing/cases/dev-expects/case-001.txt
create mode 100644 regression_testing/cases/dev-expects/case-002.html
create mode 100644 regression_testing/cases/dev-expects/case-002.txt
create mode 100644 regression_testing/cases/dev-expects/case-003.html
create mode 100644 regression_testing/cases/dev-expects/case-003.txt
create mode 100644 regression_testing/cases/dev-expects/case-004.html
create mode 100644 regression_testing/cases/dev-expects/case-004.txt
create mode 100644 regression_testing/cases/special-cases/README.txt
create mode 100755 regression_testing/cases/special-cases/case-evil.conf
create mode 100644 regression_testing/cases/special-cases/case-evil@1.html
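The commit message above describes replacing one recursive call per nesting level with a stack kept on the heap. The general technique can be sketched as follows. This is not Tidy's parser code: the node layout (a dict with a `children` list) and the function names are assumptions chosen only to show the idea.

```python
def max_depth_iterative(root):
    """Depth of a nested tree, using an explicit stack instead of recursion."""
    deepest = 0
    stack = [(root, 1)]  # a heap-allocated list stands in for the call stack
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in node["children"]:
            # Instead of recursing into the child, push it and return to the loop.
            stack.append((child, depth + 1))
    return deepest


def nested(levels):
    """Build a chain of `levels` nodes, each the only child of the previous one."""
    root = {"tag": "div", "children": []}
    node = root
    for _ in range(levels - 1):
        child = {"tag": "div", "children": []}
        node["children"].append(child)
        node = child
    return root
```

With this shape, nesting depth is limited by available memory rather than by the size of the call stack, which matches the motivation given in the commit message.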
Bisect Rest (1)
91f29ea * bad @ HTML Tidy now parses HTML non-recursively.
Bisect Log (9)
git bisect start 'next' 'bed8efb'
d08ddc2 bad Bump version. No binary change, but does affect environment.
bed8efb good Bump to 5.7.54 based on settings fix.
git bisect good db847e6e1c632c7bf361f7d82daf6736fa43b246
db847e6 good Merge pull request #981 from htacg/iterate
git bisect bad a46949f46a4cc32ed23303d456ad9c20beac3866
a46949f bad Bump to version 5.9.12.
git bisect good c22c37b5a473d4a4b0bbd23cb3051f820b3ff026
c22c37b good Add license to .github
git bisect bad 28068b1273c85d2a4b7c9441530b32d71951b24e
28068b1 bad Fixes #816.
git bisect good b6f7e4384295dd28a3eb1edcd5ee3bed23f08ea5
b6f7e43 good Merge pull request #984 from htacg/issue_946
git bisect bad 2e7ec117fdd3ed5c20e9e92ff4b282239bb7bdcd
2e7ec11 bad Bump version.
git bisect bad 91f29ea7b88a0f3a810d011f958ea9dd935bd65b
91f29ea bad HTML Tidy now parses HTML non-recursively.
91f29ea7b88a0f3a810d011f958ea9dd935bd65b is the first bad commit
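For readers unfamiliar with the log above: git bisect is essentially a binary search between a known good and a known bad commit. A rough sketch under simplifying assumptions: the history is treated as a straight line containing only the short hashes that appear in the log (the real history has merges and many more commits), and `first_bad` is a hypothetical helper, not a git API.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (commit, number_of_tests).
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):
            hi = mid   # the regression is at mid or earlier
        else:
            lo = mid   # the regression is after mid
    return commits[hi], steps


# Simplified linear history, oldest first, using hashes from the bisect log.
HISTORY = ["bed8efb", "db847e6", "c22c37b", "b6f7e43", "91f29ea",
           "2e7ec11", "28068b1", "a46949f", "d08ddc2"]
BAD = {"91f29ea", "2e7ec11", "28068b1", "a46949f", "d08ddc2"}

culprit, tests = first_bad(HISTORY, lambda c: c in BAD)
```

Each test halves the remaining range, which is why a handful of good/bad answers was enough to isolate one commit.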
Untracked files (1)
.ccls-cache/
Recent commits
91f29ea bad @ HTML Tidy now parses HTML non-recursively.
b6f7e43 good-b6f7e4384295dd28a3eb1edcd5ee3bed23f08ea5 5.9.8-next Merge pull request #984 from htacg/issue_946
efa6152 Fixes #946 by refactoring the recursion into a loop with a heap-based stack.
c055b71 Deleted LICENSE again. Enough is enough.
c22c37b good-c22c37b5a473d4a4b0bbd23cb3051f820b3ff026 Add license to .github
e11dba9 Removed docs.
995c20e Doc folder.
1213047 More static analyser fixes; version bump to 5.9.7.
5f98ccd Static analyzer fixes.
bd751a8 Fix allocation error; fix some static analyzer suggestions.
src/parser.c | 7482 +++++++++-----------
I am afraid of the contents of the
It is hard to read this.
+1
The newest HTML Tidy versions remove line breaks inside <pre>.
I can't find a way to prevent this behavior.
HTML Tidy transforms blocks of code like this:
Into this:
For languages and markup formats like YAML, where indentation is significant, HTML Tidy turns valid code into invalid code:
KiraTidyPygments.html:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Pygments code block MCVE</title>
</head>
<body>
<!-- [INFO] This code is automatically generated by the “SuperFences” extension for Python Markdown:
https://facelessuser.github.io/pymdown-extensions/extensions/superfences/#code-highlighting
From:
```yaml
kira:
  goddess: true
```
-->
<div class="SashaBlockHighlight"><pre><span></span><code><span class="nt">kira</span><span class="p">:</span><span class="w"></span>
<span class="err"> </span><span class="nt">goddess</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
</body>
Tidy removes the line breaks from the contents of a pre element when that element contains another element such as the code element. I expected tidy to preserve the line breaks inside the pre element (except perhaps a line break that immediately follows the