Closed terryjreedy closed 8 months ago
Sphinx 2.? generates different html than 1.8, so the display of Help ==> IDLE Help has extra blank lines. Among possibly other things, the contents of \<li>...\</li> are wrapped in \<p>...\</p>, and blank lines appear between the bullet and the text.
\<ul class="simple">
-\<li>coded in 100% pure Python, using the \<a class="reference internal" href="tkinter.html#module-tkinter" title="tkinter: Interface to Tcl/Tk for graphical user interfaces">\<code class="xref py py-mod docutils literal notranslate">\<span class="pre">tkinter\</span>\</code>\</a> GUI toolkit\</li>
-\<li>cross-platform: works mostly the same on Windows, Unix, and macOS\</li>
...
+\<li>\<p>coded in 100% pure Python, using the \<a class="reference internal" href="tkinter.html#module-tkinter" title="tkinter: Interface to Tcl/Tk for graphical user interfaces">\<code class="xref py py-mod docutils literal notranslate">\<span class="pre">tkinter\</span>\</code>\</a> GUI toolkit\</p>\</li>
+\<li>\<p>cross-platform: works mostly the same on Windows, Unix, and macOS\</p>\</li>
...
\</ul>
A similar issue afflicts the menu, with blank lines between the menu item and the explanation.
The original html, 3x/Doc/build/html/library/idle.html#index-0, looks normal in Firefox. The html parser class in help.py needs to ignore \<p> within \<li>. help.py should also specify which version of Sphinx it is compatible with.
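A minimal sketch of the "ignore \<p> within \<li>" idea, using the standard library's `html.parser.HTMLParser`. The class name `ListAwareParser`, the `render` helper, and the plain-text output are assumptions for illustration only; this is not the parser in help.py, which renders into a tkinter Text widget.

```python
from html.parser import HTMLParser


class ListAwareParser(HTMLParser):
    """Hypothetical sketch: <p> adds a blank line only outside <li>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.li_depth = 0   # how many <li> elements are currently open
        self.chunks = []    # pieces of the rendered plain text

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_depth += 1
            self.chunks.append("\n* ")
        elif tag == "p" and self.li_depth == 0:
            # Paragraph spacing applies only when not inside a list item.
            self.chunks.append("\n\n")

    def handle_endtag(self, tag):
        if tag == "li" and self.li_depth > 0:
            self.li_depth -= 1

    def handle_data(self, data):
        self.chunks.append(data)


def render(html):
    """Return the plain-text rendering of an html fragment."""
    parser = ListAwareParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# The Sphinx 1.8 (no <p>) and Sphinx 2 (<p> inside <li>) forms render alike.
old_html = '<ul class="simple"><li>a</li><li>b</li></ul>'
new_html = '<ul class="simple"><li><p>a</p></li><li><p>b</p></li></ul>'
print(render(old_html) == render(new_html))
```

Tracking a depth counter rather than a boolean keeps the sketch correct for nested lists.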
Do any of you have any idea what the html change might be about? Is there something wrong with idle.rst?
tl;dr I think it's a difference in the CSS for the HTML5 writer.
----------------------------------------
In the HTMLTranslator class for docutils writer [1], I found the following docstring, specifically the line "The html5_polyglot writer solves this using CSS2.".
"""
The html4css1 writer has been optimized to produce visually compact
lists (less vertical whitespace). HTML's mixed content models
allow list items to contain "<li><p>body elements</p></li>" or
"<li>just text</li>" or even "<li>text<p>and body
elements</p>combined</li>", each with different effects. It would
be best to stick with strict body elements in list items, but they
affect vertical spacing in older browsers (although they really
shouldn't).
The html5_polyglot writer solves this using CSS2.

Here is an outline of the optimization:

- Check for and omit <p> tags in "simple" lists: list items
  contain either a single paragraph, a nested simple list, or a
  paragraph followed by a nested simple list. This means that
  this list can be compact:

      - Item 1.
      - Item 2.

  But this list cannot be compact:

      - Item 1.

        This second paragraph forces space between list items.

      - Item 2.

- In non-list contexts, omit <p> tags on a paragraph if that
  paragraph is the only child of its parent (footnotes & citations
  are allowed a label first).

- Regardless of the above, in definitions, table cells, field bodies,
  option descriptions, and list items, mark the first child with
  'class="first"' and the last child with 'class="last"'. The stylesheet
  sets the margins (top & bottom respectively) to 0 for these elements.

The ``no_compact_lists`` setting (``--no-compact-lists`` command-line
option) disables list whitespace optimization.
"""
In the HTMLTranslator class for the base [2], I found this comment:
# Do not omit \<p> tags
# --------------------
#
# The HTML4CSS1 writer does this to "produce
# visually compact lists (less vertical whitespace)". This writer
# relies on CSS rules for "visual compactness".
#
# * In XHTML 1.1, e.g. a <blockquote> element may not contain
#   character data, so you cannot drop the <p> tags.
# * Keeping simple paragraphs in the field_body enables a CSS
#   rule to start the field-body on a new line if the label is too long
# * it makes the code simpler.
Since both comments are a few years old, I think it's in the CSS.
[1] https://sourceforge.net/p/docutils/code/HEAD/tree/trunk/docutils/docutils/writers/html4css1/__init__.py
[2] https://sourceforge.net/p/docutils/code/HEAD/tree/trunk/docutils/docutils/writers/_html_base.py
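To make the "simple list" rule from the html4css1 docstring concrete, here is a small sketch over a made-up tree representation. The `(kind, payload)` tuples and the function name `is_simple_list` are assumptions for illustration; docutils works on its own node classes, and this is not its implementation.

```python
def is_simple_list(items):
    """Return True if the list qualifies as "simple" per the docstring.

    `items` is a list of list items; each item is a list of
    (kind, payload) children, where kind is "paragraph" (payload: text)
    or "list" (payload: a nested list of items). Hypothetical model.
    """
    allowed = (["paragraph"], ["list"], ["paragraph", "list"])
    for children in items:
        kinds = [kind for kind, _ in children]
        if kinds not in allowed:
            return False
        # Any nested list must itself be simple.
        for kind, payload in children:
            if kind == "list" and not is_simple_list(payload):
                return False
    return True


# The two examples from the docstring.
compact = [[("paragraph", "Item 1.")], [("paragraph", "Item 2.")]]
loose = [
    [("paragraph", "Item 1."),
     ("paragraph", "This second paragraph forces space between list items.")],
    [("paragraph", "Item 2.")],
]
print(is_simple_list(compact), is_simple_list(loose))
```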
Adding on to my last post, it's not in the CSS, but it's that Sphinx 2.0 switches from a default of HTML4 to HTML5. The docutils comments explain the difference between the two.
https://github.com/sphinx-doc/sphinx/commit/a3cdd465ecf018fa5213b6b2c1c4e495973a2896
Thank you for the research, including the crucial commit! What I understand from the quotes:
Sphinx 2 writes HTML5 by default. The html5 writer always writes paragraphs because they are required by the xhtml used by html5.
Firefox, for instance, displays the result the same as before, either because it has the logic to avoid extra blank lines when reading html5 or because this is taken care of by revised css (this is unclear from the quotes).
To deal with html5, our converter would have to ignore the \<p>s that the html4 writer omitted, by adding logic for the cases used in idle.rst. Not fun.
Reading the commit (3rd line) revealed a new sphinx configuration option: html4_writer, defaulting to False. When I switched from building html with my 3.6 install with sphinx 1.8.1 to 3.7 with 2.something, and added "-D html4_writer=1" to a direct call of sphinx-build, I indeed got html without added \<p>s. The only difference was the irrelevant omission of '\n' between list item header and text in the html file. Example:
-\<dt>New File\</dt>
-\<dd>Create a new file editing window.\</dd>
+\<dt>New File\</dt>\<dd>Create a new file editing window.\</dd>
Setting SPHINXOPTS should work when using 'Doc/make.bat html'. I will prepare a PR documenting our parser requirement and include the neutral html changes.
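For reference, a sketch of how the direct sphinx-build call with the override could be assembled from Python. The function name `sphinx_html_command` and the directory arguments are placeholders I made up; only the `-D html4_writer=1` override comes from the discussion above, and the sketch builds the argument list without running anything.

```python
def sphinx_html_command(source_dir, build_dir, html4=True):
    """Build the argv list for an html build (hypothetical helper)."""
    cmd = ["sphinx-build", "-b", "html"]
    if html4:
        # Override the config value from the command line, as in the
        # "-D html4_writer=1" call described above.
        cmd += ["-D", "html4_writer=1"]
    cmd += [source_dir, build_dir]
    return cmd


# Placeholder paths; pass the list to subprocess.run() to execute it.
print(" ".join(sphinx_html_command("Doc", "Doc/build/html")))
```

The same effect should be available by setting `html4_writer = True` in conf.py, assuming a Sphinx version that still supports the option.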
The blank lines between list bullets and text and between menu items and explanations were fixed in a PR not linked to this issue.
The single-spaced list at the top of the file begins with `<ul class="simple">`. The double-spaced main list in Key bindings begins with just `<ul>`, whereas the nested list again includes `class="simple"`. The same difference appears at the end of Shell window.
Our formatter double spaces the non-simple list. I think the consistently single-spaced Firefox list format is better.
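One way to find which lists lack the "simple" class is to count the \<ul> start tags by class. This is a rough diagnostic sketch using `html.parser.HTMLParser`; the names `ListClassCounter` and `count_lists` are hypothetical and not part of help.py.

```python
from html.parser import HTMLParser


class ListClassCounter(HTMLParser):
    """Hypothetical sketch: count <ul> tags with and without class "simple"."""

    def __init__(self):
        super().__init__()
        self.simple = 0
        self.other = 0

    def handle_starttag(self, tag, attrs):
        if tag != "ul":
            return
        # An attribute given without a value is reported as None.
        classes = (dict(attrs).get("class") or "").split()
        if "simple" in classes:
            self.simple += 1
        else:
            self.other += 1


def count_lists(html):
    """Return (simple_count, other_count) for an html fragment."""
    counter = ListClassCounter()
    counter.feed(html)
    counter.close()
    return counter.simple, counter.other


# A plain outer list with a nested simple list, plus one simple list.
sample = ('<ul class="simple"><li>a</li></ul>'
          '<ul><li>b<ul class="simple"><li>c</li></ul></li></ul>')
print(count_lists(sample))
```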
The PR stops double spacing non-simple lists when displayed by Help => IDLE Doc and will close this issue. (Edge browser also double spaces such lists but will not be affected.) A separate PR will revise the lists to make them clearer and make them all simple.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = 'https://github.com/terryjreedy'
closed_at = None
created_at =
labels = ['3.8', 'expert-IDLE', 'type-bug', '3.7', '3.9']
title = 'IDLE: Revise html to tkinker converter for help.html'
updated_at =
user = 'https://github.com/terryjreedy'
```
bugs.python.org fields:
```python
activity =
actor = 'terry.reedy'
assignee = 'terry.reedy'
closed = False
closed_date = None
closer = None
components = ['IDLE']
creation =
creator = 'terry.reedy'
dependencies = []
files = []
hgrepos = []
issue_num = 37298
keywords = []
message_count = 4.0
messages = ['345722', '346205', '346206', '346241']
nosy_count = 4.0
nosy_names = ['terry.reedy', 'markroseman', 'mdk', 'cheryl.sabella']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue37298'
versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']
```
Linked PRs