readthedocs / sphinx_rtd_theme

Sphinx theme from Read the Docs
https://sphinx-rtd-theme.readthedocs.io/
MIT License

Question mark instead of apostrophe #1209

Closed PrimozGodec closed 3 years ago

PrimozGodec commented 3 years ago

Problem

We are switching from Alabaster to sphinx_rtd_theme. We noticed that all apostrophes in the documentation are now question marks (see the image below). I am not sure if it is an issue for this repository, but I guess it has something to do with the theme, since it was not the case with Alabaster before. Changes and config are available here: https://github.com/biolab/orange3-text/pull/695. The documentation was built with the make htmlhelp command. Any advice on what to check or how to debug is welcome.

Screenshot 2021-08-18 at 16 36 11

Reproducible Project

https://github.com/biolab/orange3-text/pull/695

Error Logs/Results

There are no errors in the log. This is the build log:

sphinx-build -b htmlhelp -d _build/doctrees   . _build/htmlhelp
Running Sphinx v4.0.2
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [htmlhelp]: targets for 0 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
no targets are out of date.
build succeeded.

You can now run HTML Help Workshop with the .htp file in _build/htmlhelp.

Build finished; now you can run HTML Help Workshop with the .hhp project file in _build/htmlhelp.

Expected Results

Apostrophes should be parsed as apostrophes.

Environment Info

agjohnson commented 3 years ago

This looks like a missing font codepoint. The character is not a basic apostrophe, but is instead some other Unicode apostrophe. You can try replacing it with a basic apostrophe. The font used here has fairly good Unicode support, but it's not surprising that there is no codepoint here; there are a lot of apostrophe permutations.

PrimozGodec commented 3 years ago

@agjohnson thank you for your response. I checked the apostrophe and it is a basic apostrophe with the code 0x27.
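For anyone who wants to repeat this check on their own sources, here is a minimal Python sketch that scans text for characters that look like an apostrophe but are not 0x27. The function name and the set of look-alike characters are illustrative assumptions, not something taken from the project.

```python
import unicodedata

# Characters commonly mistaken for the ASCII apostrophe (U+0027).
# This set is an illustrative assumption, not an exhaustive list.
LOOKALIKES = {"\u2018", "\u2019", "\u02bc", "\u0060", "\u00b4"}


def find_apostrophe_lookalikes(text):
    """Return (line, column, codepoint, name) for each look-alike character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in LOOKALIKES:
                hits.append((line_no, col, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return hits


# The first line uses a plain 0x27 apostrophe, the second a curly one.
sample = "It's plain here.\nBut it\u2019s curly here."
print(find_apostrophe_lookalikes(sample))
```

An empty result for a source file means every apostrophe in it is the plain 0x27 character, which matches what was found here.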

I am not sure if it is really a font issue, since when I open the compiled HTML there is already a � (replacement) character in the generated code. So the change seems to happen during compilation of the documentation.

I also tried retyping the apostrophe characters, but they still compile as �.

agjohnson commented 3 years ago

Ah, I missed that this is the htmlhelp builder. This maybe isn't directly a font issue, but the codepoint is incredibly wrong. The sphinx htmlhelp builder does weird things with escaping characters for the CHM/etc format, which I don't quite understand. I was easily able to replicate on our docs using the htmlhelp builder though.

In the raw html output, a simple apostrophe in source was encoded as 0x92 for both themes.
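One possible reading of that 0x92 byte, sketched in Python: in the Windows-1252 code page, 0x92 is the right single quotation mark (U+2019), while the same lone byte is invalid UTF-8 and decodes to the replacement character. That the htmlhelp output is written as Windows-1252 and then read as UTF-8 is an assumption here, not something confirmed in this thread.

```python
raw = b"\x92"  # the byte observed in the raw html output

# Assumption: the file was written using the Windows-1252 code page.
as_cp1252 = raw.decode("cp1252")

# Assumption: the viewer interprets the same byte as UTF-8.
as_utf8 = raw.decode("utf-8", errors="replace")

print(f"cp1252: U+{ord(as_cp1252):04X}")  # U+2019, right single quotation mark
print(f"utf-8:  U+{ord(as_utf8):04X}")    # U+FFFD, replacement character

# A plain apostrophe would have stayed 0x27 in either encoding.
print("'".encode("cp1252"), "'".encode("utf-8"))
```

If this reading is right, the source apostrophe was first turned into a curly quote and only then written out as a single byte, which would explain why retyping the character in the source makes no difference.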

I can't explain why output with this theme is different than Alabaster though. Even switching the font to the same font as Alabaster -- system default DejaVu Serif for me -- doesn't change the display here, it's still a missing codepoint.

The theme doesn't do anything special here, this text is all generated at the Sphinx level.

I doubt we have ever tested the htmlhelp format in the past, so safe to say we don't really support the format -- not directly at least. In fact, afaik, htmlhelp requires html4 output, which we're dropping support for in an upcoming release. If you come up with any more clues as to why this is happening, or a potential fix, we'd certainly accept the bug fix though.

PrimozGodec commented 3 years ago

Thank you for your concrete reply. I found out that we no longer need to use the htmlhelp builder, so I just moved to the html builder. Everything is working correctly with the html builder. I am guessing that it has something to do with htmlhelp smart quoting.
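If the smart quoting guess is right, one thing to try for anyone who still needs the htmlhelp builder is turning off Sphinx's smartquotes option in conf.py. This is an untested sketch: nobody in this thread verified that it removes the � characters.

```python
# conf.py (fragment)
# `smartquotes` is a standard Sphinx option; setting it to False stops
# straight quotes and apostrophes from being converted to curly ones.
# Assumption: the converted curly apostrophe is what ends up as 0x92.
smartquotes = False
```

With this setting the apostrophe should stay as 0x27 in the output, at the cost of losing typographic quotes in every builder that reads this conf.py.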