Closed eriksjolund closed 2 years ago
This is caused by the "smart" extensions in Sphinx (podman.io) and pandoc (windows remote docs). When it works the typography is nicer, but if it gets confused the result can be ugly or just plain wrong.
I'm wondering whether it would be best to just switch that feature off. We'd lose the following conversions:
"text goes here" and ``text goes here'' into “text goes here” (curly double quotes)
'text goes here' and `text goes here' into ‘text goes here’ (curly single quotes)
John's dog doesn't bark into John’s dog doesn’t bark (apostrophes)
... into … (ellipsis)
-- into – (en dash)
--- into — (em dash)
The manpage build doesn't use "smart", so everything in source/markdown is already readable and correct without it.
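To make the conversions listed above concrete, here is a toy Python sketch of them. It is only an approximation written for this discussion, not the code Sphinx or pandoc actually run, and the function name `toy_smart` is made up. It does show how a long option loses a dash once `--` is treated as typography.

```python
import re


def toy_smart(text: str) -> str:
    """Toy approximation of "smart" typography (hypothetical helper)."""
    # Handle the longest dash run first, otherwise "---" would turn
    # into an en dash followed by a plain hyphen.
    text = text.replace("---", "\u2014")  # em dash
    text = text.replace("--", "\u2013")   # en dash
    text = text.replace("...", "\u2026")  # ellipsis
    # A straight quote between two word characters is an apostrophe.
    text = re.sub(r"(?<=\w)'(?=\w)", "\u2019", text)
    # A pair of straight double quotes becomes curly double quotes.
    text = re.sub(r'"([^"]*)"', "\u201c\\1\u201d", text)
    return text


# The two ASCII dashes of the long option collapse into one en dash.
print(toy_smart("podman pod rm --pod-id-file /path/to/id/file"))
```

The single hyphens inside `pod-id-file` are untouched, which matches the mixed dashes seen on the site.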
@eriksjolund can you give me a pointer to an example of a long dash on the site please?
@rcowsill - I'm fine with the changes that you pointed out except:
-- into – (en dash)
Will that change all of the man pages that you recently set to \-\-format to an en dash?
@TomSweeneyRedHat Here is an example
$ curl -s https://docs.podman.io/en/latest/markdown/podman-pod-rm.1.html | grep $'\u2013'
<p>podman pod rm –pod-id-file /path/to/id/file</p>
$
Note that the first dash in –pod-id-file (before "pod") is different from the dashes before "id" and "file".
$ curl -s https://docs.podman.io/en/latest/markdown/podman-pod-rm.1.html | grep $'\u2013' | od -c
0000000   <   p   >   p   o   d   m   a   n       p   o   d       r   m
0000020     342 200 223   p   o   d   -   i   d   -   f   i   l   e
0000040   /   p   a   t   h   /   t   o   /   i   d   /   f   i   l   e
0000060   <   /   p   >  \n
0000065
$
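The three octal bytes 342 200 223 in the od output are the UTF-8 encoding of U+2013 (EN DASH). A small Python check, with a helper name (`octal_bytes`) chosen just for this example:

```python
def octal_bytes(text: str) -> list[str]:
    """Return the UTF-8 bytes of text as octal strings, as od -c prints them."""
    return [format(byte, "o") for byte in text.encode("utf-8")]


print(octal_bytes("\u2013"))  # EN DASH
print(octal_bytes("--"))      # two ASCII hyphens, for comparison
```

So one three-byte character has taken the place of the two one-byte hyphens.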
I also tried to visit the same web page https://docs.podman.io/en/latest/markdown/podman-pod-rm.1.html with Google Chrome. When I copy-paste from Google Chrome it looks like this:
podman pod rm –pod-id-file /path/to/id/file
Another example is here: http://docs.podman.io/en/latest/markdown/podman-save.1.html#compress
The heading correctly says --compress, but the description refers to –format instead of --format. The problem isn't so much that it's the wrong type of dash, but the fact that there's only one dash instead of two.
To clarify my previous post, I was listing the conversions that are currently being done in the build process.
The en dash conversion is the one that causes this issue. I think it would be safest to switch that off so whenever someone types -- in the markdown it ends up as -- in the HTML version. Existing uses of \-\- would still convert to -- in the HTML as they do at present.
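For the Sphinx side, switching off only the dash conversion could look roughly like the following conf.py fragment. This is a sketch based on the documented Sphinx options `smartquotes` and `smartquotes_action`, not necessarily what the podman.io build ends up using, and it says nothing about the pandoc build.

```python
# conf.py (Sphinx) - sketch only, assuming a Sphinx version that
# supports the smartquotes settings.

# Keep the smart typography transform enabled in general ...
smartquotes = True

# ... but drop the dash handling. The documented default is "qDe":
#   q = convert quotes, D = old-school dashes (-- to en dash,
#   --- to em dash), e = convert ellipses.
# Leaving out "D" keeps curly quotes and ellipses while "--" stays "--".
smartquotes_action = "qe"

# Alternative: turn the whole feature off instead.
# smartquotes = False
```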
Ah, light dawns on Marblehead here. Thanks for the pointers. I was looking at the options, not the text in the examples or explanatory text.
So as painful as it may be, could we fix this by changing the md files to \-\- throughout, not just the options? I'm just a little leery about turning off an option that might bite us somewhere else.
Regardless, thanks to both of you for running this down.
I looked into replacing all the -- with \-\- in the .mds, but hit a problem. Code blocks don't process escape sequences, so any -- inside a code block needs to be left as-is.
For example, here: http://docs.podman.io/en/latest/markdown/podman-stats.1.html#example. If we converted -- to \-\- throughout the source, the long options in the command line would appear as (e.g.) \-\-no-stream and the table output would be misaligned.
The same applies anywhere that code spans were used to inline a long option in a sentence, e.g.: https://docs.podman.io/en/latest/markdown/podman-create.1.html#ip-ip.
Taking this approach would leave a mix of -- and \-\- in the source, which would make editing quite error-prone.
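To illustrate why the escaping approach gets fiddly, here is a rough Python sketch of what a source-wide rewrite would have to do: escape -- in ordinary text while leaving fenced code blocks and inline code spans alone. The function name is invented for this example, and it deliberately ignores indented code blocks and other markdown corner cases, so treat it as an illustration rather than a usable migration script.

```python
import re

FENCE = "`" * 3                       # marker that opens or closes a fenced block
CODE_SPAN = re.compile(r"(`[^`]*`)")  # a simple inline code span


def escape_long_options(markdown: str) -> str:
    """Escape "--" outside fenced blocks and inline code spans (sketch)."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence   # fence lines toggle the state
            out.append(line)
            continue
        if in_fence:
            out.append(line)          # code blocks do not process escapes
            continue
        parts = CODE_SPAN.split(line)
        # With one capture group, odd indexes hold the code spans.
        parts = [
            part if index % 2 else part.replace("--", r"\-\-")
            for index, part in enumerate(parts)
        ]
        out.append("".join(parts))
    return "\n".join(out)


print(escape_long_options("Use --format, or `--no-stream` in a code span."))
```

Even in this simplified form, the result is a source file containing both spellings, which is the editing hazard described above.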
I made a draft PR with the config changes so people can take a look at the resulting HTML. Feel free to comment over there!
I noticed that http://docs.podman.io/en/v3.0.1/markdown/podman-run.1.html contains the text:
–log-driver=”driver”
It doesn't use two standard dashes.
@eriksjolund v3.0.1 was released before the original fix containers/podman#9856. It looks like all the double dashes in v3.0.1 are shown as en dashes, except those in a code/pre block.
@rcowsill Yes, you're right. I see, for instance, that http://docs.podman.io/en/v3.1.0/markdown/podman-run.1.html has the correct type of dashes.
The following command downloaded 352 files.
$ wget -r -l inf https://docs.podman.io/en/latest/
I then searched recursively for EN DASH:
$ grep --include '*.html' -r $'\u2013' .
./docs.podman.io/en/latest/markdown/podman-build.1.html:<li><p>Local directory – e.g. --build-context project2=../path/to/project2/src (This option is not available with the remote Podman client. On Podman machine setup (i.e macOS and Winows) path must exists on the machine VM)</p></li>
./docs.podman.io/en/latest/markdown/podman-build.1.html:<li><p>HTTP URL to a tarball – e.g. --build-context src=https://example.org/releases/src.tar</p></li>
./docs.podman.io/en/latest/markdown/podman-build.1.html:<li><p>Container image – specified with a container-image:// prefix, e.g. --build-context alpine=container-image://alpine:3.15, (also accepts docker://, docker-image://)</p></li>
$
Those seem to be legitimate. The problem seems to be gone. (At least under https://docs.podman.io/en/latest/)
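The manual check above (en dash surrounded by spaces is fine, en dash glued to a word is probably a mangled long option) can be sketched in Python. The heuristic and the names `mark_suspicious` and `scan` are assumptions made for this example; the `<ENDASH>` marker just makes findings easier to spot.

```python
import re
from pathlib import Path

EN_DASH = "\u2013"
# Heuristic: an en dash directly followed by a letter looks like a
# mangled "--option"; an en dash followed by a space is normal prose.
SUSPICIOUS = re.compile(EN_DASH + r"(?=[A-Za-z])")


def mark_suspicious(line: str) -> str:
    """Replace suspicious en dashes with a visible <ENDASH> marker."""
    return SUSPICIOUS.sub("<ENDASH>", line)


def scan(root: str):
    """Yield (path, line number, marked line) for suspicious lines under root."""
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if SUSPICIOUS.search(line):
                yield path, number, mark_suspicious(line)


print(mark_suspicious("<p>podman pod rm \u2013pod-id-file /path/to/id/file</p>"))
print(mark_suspicious("<li><p>Local directory \u2013 e.g. --build-context</p></li>"))
```

Running `scan(".")` in the directory created by the wget download would list only the lines that still need a look.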
This issue is related to https://github.com/containers/podman.io/issues/373
I noticed there are still some long command-line options that are not shown correctly. Instead of -- an – is shown.
It seems the character is called EN DASH: https://charbase.com/2013-unicode-en-dash
The character EN DASH can be written with $'\u2013' in a Bash shell.
I'll use the $'\u2013' notation so that it is clear what type of dash character is being used.
Let's use sed to replace the EN DASH character with the text string <ENDASH> so that any findings will be more visible:
I created an empty directory and let the following command run for a while to download manual pages. I pressed Ctrl-C to terminate it.
The grep command found some more examples of the EN DASH character.