Debian / debiman

debiman generates a static manpage HTML repository out of a Debian archive
Apache License 2.0

Mismatched indents in mdoc pages using .Pp #116

Closed: cjwatson closed this issue 2 years ago

cjwatson commented 5 years ago

sshd_config(5) is rendered oddly: the first paragraph of DESCRIPTION should be at the same indentation level as the second, but it is not. You can see the same effect in each of the list elements that has multiple paragraphs under it.

The source for the first two paragraphs and the heading above them looks like this:

.Sh DESCRIPTION
.Xr sshd 8
reads configuration data from
.Pa /etc/ssh/sshd_config
(or the file specified with
.Fl f
on the command line).
The file contains keyword-argument pairs, one per line.
For each keyword, the first obtained value will be used.
Lines starting with
.Ql #
and empty lines are interpreted as comments.
Arguments may optionally be enclosed in double quotes
.Pq \&"
in order to represent arguments containing spaces.
.Pp
Note that the Debian
.Ic openssh-server
package sets several options as standard in
.Pa /etc/ssh/sshd_config
which are not the default in
.Xr sshd 8 :

I see that the first paragraph in the output is not enclosed in <p class="Pp"> (or indeed any other kind of <p>), while the second and subsequent ones are. This may be a bug in mandoc, since the documentation of .Pp in groff_mdoc(7) says: "The ‘.Pp’ paragraph command may be used to specify a line space where necessary. The macro is not necessary after a ‘.Sh’ or ‘.Ss’ macro ...". This seems to make it clear that it acts more as a separator than as something that introduces each paragraph, and so I'm confident that this is a bug in the renderer rather than in the page source.
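To illustrate the "separator" reading of .Pp, here is a minimal Python sketch. It is not mandoc's code: the function names `split_paragraphs` and `render_section` are made up for this example, and it assumes a section body that contains only plain text lines and bare `.Pp` lines (no other macros). It shows the output shape the report expects, where the first paragraph is wrapped just like the later ones.

```python
from html import escape


def split_paragraphs(body_lines):
    """Split one section body into paragraphs, treating `.Pp` as a separator."""
    paragraphs, current = [], []
    for line in body_lines:
        if line.strip() == ".Pp":
            # A separator closes the paragraph collected so far, if any.
            if current:
                paragraphs.append(current)
            current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return paragraphs


def render_section(body_lines):
    """Wrap every paragraph, including the first one, in <p class="Pp">."""
    out = []
    for para in split_paragraphs(body_lines):
        text = " ".join(escape(line, quote=False) for line in para)
        out.append(f'<p class="Pp">{text}</p>')
    return "\n".join(out)


# Simplified stand-in for the two DESCRIPTION paragraphs quoted above.
sample = [
    "The file contains keyword-argument pairs, one per line.",
    ".Pp",
    "Note that the Debian package sets several options as standard.",
]
print(render_section(sample))
```

With this reading, a page that starts its text right after .Sh and a page that puts a redundant .Pp there render identically, which matches the groff_mdoc(7) wording.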

Running mandoc -T html by hand shows a somewhat similar effect in terms of whether paragraph text is enclosed in <div class="Pp"> or not, but the output is sufficiently different that I'm not sure how much of this is customised by debiman.

stapelberg commented 5 years ago

When adding a <p class="Pp"> and </p> around the paragraph in question, it looks correct.

debiman doesn’t change much about the HTML, and doesn’t touch this part at all. The HTML structure is the same in mandoc -Thtml.

@ischwarze any thoughts on this issue? Thanks!

ischwarze commented 5 years ago

This issue has already been reported by Anthony Bentley some time ago, and i have started working on it. It is somewhat tricky to find out where exactly <p> elements have to be inserted, in particular because it also interacts with fill mode changes, so this needs more work than it might seem, but i'm confident it can be done.

ischwarze commented 5 years ago

In CVS HEAD, after some preparations, i committed changes to insert some implicit "p" elements as suggested in this ticket. The new code is installed on man.openbsd.org, so you can easily inspect the effect by looking at the HTML output of that site (or compile mandoc from CVS and run it by hand).

It may still not be perfect. For example, if there is multi-paragraph text in the body of a list item, the first paragraph still won't get a "p". Clearly, putting "p" inside every "li" and "dd" would be excessive because many list item bodies only contain single words, not whole paragraphs. Maybe a "p" should be inserted at the beginning of non-compact list item bodies that contain at least one explicit .Pp, or something similar. But i didn't do that yet, so that we can first watch the dust settle, then maybe advance another step.
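The heuristic floated in the paragraph above can be sketched in Python as follows. This is only an illustration of the idea, not anything mandoc implements: the `ListItem` type and the `wants_leading_p` function are hypothetical, and the item body is assumed to be available as raw mdoc source lines.

```python
from dataclasses import dataclass, field


@dataclass
class ListItem:
    # Hypothetical representation: raw mdoc source lines of the item body.
    body: list = field(default_factory=list)
    # True when the enclosing list was declared with -compact.
    compact: bool = False


def wants_leading_p(item):
    """Decide whether the first paragraph of an item body should get an implicit <p>.

    Rule from the comment above: only non-compact items whose body contains
    at least one explicit .Pp qualify.
    """
    has_explicit_pp = any(line.strip() == ".Pp" for line in item.body)
    return not item.compact and has_explicit_pp


single_word = ListItem(body=["yes"])
multi_para = ListItem(body=["First paragraph.", ".Pp", "Second paragraph."])
compact_multi = ListItem(body=["First.", ".Pp", "Second."], compact=True)

for item in (single_word, multi_para, compact_multi):
    print(wants_leading_p(item))
```

The point of requiring an explicit .Pp is that single-word item bodies stay unwrapped, while bodies that are visibly written as several paragraphs get consistent markup.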

There may be more cases where additional automatic "p" elements might make sense inside specific flow containers.

Please tell me if you encounter such cases that matter in practice.

Apart from that, i think we are getting closer to the point where i will consider rolling the next mandoc release, maybe in a few weeks, after checking whether any reported bugs remain that can be fixed without too much risk and disruption.

stapelberg commented 2 years ago

Good news, everyone! mandoc 1.14.6 was released and is now deployed, so this bug is now fixed!