Closed Netspider closed 8 years ago
Not our bug. Please make sure you’re running an up-to-date version of rst2html5 (pip install -U rst2html5), delete the cache/ file for this post, and if the issue persists, report a bug upstream.
Turns out we can (and should) actually fix this, which will happen in a minute or two.
Hi, I'm the author and maintainer of rst2html5 (https://pypi.io/project/rst2html5/). The project is hosted at https://bitbucket.org/andre_felipe_dias/rst2html5 and I'd like to help fix this issue.
I couldn't replicate the issue. It looks like it is resolved already.
@andredias I worked around it, but partially:
We’re using a custom code block. It works fine with docutils, but with rst2html5 (plugin), there are indentation issues: getnikola/plugins#152
Unfortunately my patch does not cut it, because that breaks tabs (U+0009).
Here’s a gist with a sample post and differences: https://gist.github.com/Kwpolska/a2d4268a08b60df895180f5aa6fd5513
I have tracked the issue down to a difference in self.content for our CodeBlock directive:
. render_posts:cache/posts/index4.html
N. ['sudo vim /etc/lightdm/lightdm-gtk-greeter.conf']
N. ['[base]', 'session=/usr/bin/startlxde', '...', '[userlist]', 'disable=1']
N. ['foo', 'bar', ' baz (this indented with 3 spaces and 2 tabs)', 'foobar', ' foobaz']
. render_posts:cache/posts/index.html
N. ['sudo vim /etc/lightdm/lightdm-gtk-greeter.conf']
N. ['[base]', 'session=/usr/bin/startlxde', '...', '[userlist]', 'disable=1']
N. ['foo', 'bar', 'baz (this indented with 3 spaces and 2 tabs)', 'foobar', ' foobaz']
As you can see, we’re missing the indentation of baz. Is this a bug on our side or is this an issue with how rst2html5 parses things?
Neither rst2html.py nor rst2html5 uses tabs in pre (code) output. Both replace tabs with a number of spaces defined by the tab-width setting. Actually, the parser does that, not the writers, so there is nothing the writers can do about it. That said, using 0 for tab-width breaks tabs, as you noticed. I suggest turning it back to 4 and looking for a solution elsewhere.
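The effect of the tab width can be reproduced with nothing but Python's built-in `str.expandtabs`. This is a minimal sketch that assumes the parser's tab handling behaves like `expandtabs`; the helper name `expand_line` and the sample line are made up for illustration.

```python
def expand_line(line: str, tab_width: int) -> str:
    """Expand hard tabs the way a parser might before a directive sees the line.

    Assumption: tab handling is equivalent to str.expandtabs(tab_width).
    """
    return line.expandtabs(tab_width)


# A line indented with 3 spaces followed by 2 tabs, as in the sample post.
sample = "   \t\tbaz"

with_width_4 = expand_line(sample, 4)  # tabs advance to columns 4 and 8
with_width_0 = expand_line(sample, 0)  # a width of 0 drops the tabs entirely

print(repr(with_width_4))
print(repr(with_width_0))
```

With a width of 0 the tabs vanish instead of turning into spaces, which is consistent with the observation in this thread that a tab width of 0 breaks tabs.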
Since the snippet given earlier by @Netspider works fine with both rst2html.py and rst2html5, it looks like the bug could be in Nikola's code-block directive. Does it mix tabs and spaces somehow? Mixing tabs and spaces is as bad in reStructuredText as it is in Python.
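One way to check a post for mixed indentation is a small standard-library helper. This is only a sketch; the function names and the sample text are hypothetical and not part of Nikola or rst2html5.

```python
def has_mixed_indent(line: str) -> bool:
    """Return True if the leading whitespace contains both tabs and spaces."""
    indent = line[: len(line) - len(line.lstrip(" \t"))]
    return " " in indent and "\t" in indent


def find_mixed_indent(source: str) -> list[int]:
    """Return 1-based line numbers whose indentation mixes tabs and spaces."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if has_mixed_indent(line)
    ]


# Third line is indented with 3 spaces and 2 tabs, like the sample post.
sample = "foo\nbar\n   \t\tbaz\nfoobar\n    foobaz\n"
print(find_mixed_indent(sample))
```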
I'd like to point out that rst2html5 has had its own code-block directive since v1.7. Maybe it would be better to use that one.
> Mixing tabs and spaces is as bad in reStructuredText as it is in Python.
Well, that’s a great reason to say our current thing is good enough.
> That said, using 0 for tab-width breaks tabs as you noticed. I suggest turning it back to 4 and looking for a solution elsewhere.
Okay, what other solution should we use? The root of the problem is that rst2html5 outputs everything indented by 4 spaces by default.
rst2html5's output indentation is intended to produce HTML5 code that is easier to read. You could use the option --no-indent if you like, but it has nothing to do with code blocks.
Let me rephrase some of my previous findings: when you use the snippet directly with rst2html.py (docutils) or rst2html5, the output is correct:
$ echo '.. code:: bash

    # this is at the beginning of the line
        # this is indented by 4 spaces
        # and this line is also indented by 4 spaces' > code.rst
$ rst2html.py code.rst
$ rst2html5 code.rst
The rst2html.py output is:
...
<pre class="code bash literal-block">
<span class="comment single"># this is at the beginning of the line
# this is indented by 4 spaces
# and this line is also indented by 4 spaces</span>
</pre>
The rst2html5 output is:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
</head>
<body>
<pre class="code bash"><span class="c"># this is at the beginning of the line
# this is indented by 4 spaces
# and this line is also indented by 4 spaces</span></pre>
</body>
</html>
Actually, it should have been <pre data-language="bash">... but the indentation is correct. I'll take a look at this right now.
Note that since both outputs are correct, the problem must be elsewhere.
I'm not blaming anyone. Besides, I want to make it clear that I use Nikola for my own blog (https://blog.pronus.io) and I have a great interest in making rst2html5 work well with Nikola. That said, my further line of investigation will be:
Without Nikola code blocks (rest plugin unreachable): standard rst2html5 behavior
<p>Write your post here.</p>
<section id="header">
<h1>Header</h1>
<p>Foobar.</p>
<pre class="code bash"><span class="c"># this is at the beginning of the line
# this is indented by 4 spaces
# and this line is also indented by 4 spaces</span></pre>
<p>Another.</p>
<pre class="code bash"><span class="c"># this is at the beginning of the line
# this is indented by 4 spaces
# and this line is also indented by 4 spaces</span></pre>
</section>
With Nikola code blocks: all but the first line have extra indent
<p>Write your post here.</p>
<section id="header">
<h1>Header</h1>
<p>Foobar.</p>
<pre class="code bash"><a name="rest_code_fa1f227a0edc4c64a17b7aeb16be17c8-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_fa1f227a0edc4c64a17b7aeb16be17c8-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_fa1f227a0edc4c64a17b7aeb16be17c8-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre>
<p>Another.</p>
<pre class="code bash"><a name="rest_code_4ada58ebed99409b946b06d8f2a91ec4-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_4ada58ebed99409b946b06d8f2a91ec4-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_4ada58ebed99409b946b06d8f2a91ec4-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre>
</section>
With --no-indent equivalent: all code appears on one line
<p>Write your post here.</p><section id="header"><h1>Header</h1><p>Foobar.</p><pre class="code bash"><a name="rest_code_1fd4ac2446d34833b676a164575b852e-1"></a><span class="c"># this is at the beginning of the line</span><a name="rest_code_1fd4ac2446d34833b676a164575b852e-2"></a><span class="c"># this is indented by 4 spaces</span><a name="rest_code_1fd4ac2446d34833b676a164575b852e-3"></a><span class="c"># and this line is also indented by 4 spaces</span></pre><p>Another.</p><pre class="code bash"><a name="rest_code_e5c59655af9a4570b654f04ce7c5b2ae-1"></a><span class="c"># this is at the beginning of the line</span><a name="rest_code_e5c59655af9a4570b654f04ce7c5b2ae-2"></a><span class="c"># this is indented by 4 spaces</span><a name="rest_code_e5c59655af9a4570b654f04ce7c5b2ae-3"></a><span class="c"># and this line is also indented by 4 spaces</span></pre></section>
With tab width set to 0: works (we currently use that)
<p>Write your post here.</p>
<section id="header">
<h1>Header</h1>
<p>Foobar.</p>
<pre class="code bash"><a name="rest_code_94a2c8897e5146ba8465c86862a81671-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_94a2c8897e5146ba8465c86862a81671-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_94a2c8897e5146ba8465c86862a81671-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre>
<p>Another.</p>
<pre class="code bash"><a name="rest_code_da3a08f82f2b45028b3468c450e043ff-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_da3a08f82f2b45028b3468c450e043ff-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_da3a08f82f2b45028b3468c450e043ff-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre>
</section>
OK, now I get it. The problem is that Nikola's CodeBlock directive returns a raw node with HTML text content (https://github.com/getnikola/nikola/blob/master/nikola/plugins/compile/rest/listing.py#L113). This is not correct, sorry. It should return a literal_block node instead, as Sphinx does (https://github.com/sphinx-doc/sphinx/blob/master/sphinx/directives/code.py#L119). I've copied their code-block directive into rst2html5. I suggest you do the same, or just disable Nikola's code-block directive for rst2html5.
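Why pre-rendered HTML is fragile here can be shown with a small standard-library sketch. It assumes, purely for illustration, that the writer re-indents every line of its output by a fixed prefix, which is a simplification of what rst2html5 does; `reindent` and the sample markup are hypothetical.

```python
import textwrap

# Raw HTML as a directive might hand it to the writer: one <pre> over three lines.
raw_html = (
    '<pre class="code bash"># first line\n'
    "    # second line, indented by 4 spaces\n"
    "    # third line, indented by 4 spaces</pre>"
)


def reindent(html: str, levels: int, width: int = 4) -> str:
    """Naively indent every line of an HTML fragment (hypothetical pretty-printer)."""
    return textwrap.indent(html, " " * (levels * width))


pretty = reindent(raw_html, levels=1)
lines = pretty.splitlines()

# The indent on the first line sits before the <pre> tag and is harmless; the
# indent added to later lines lands inside <pre>, where whitespace is significant.
print(lines[1])
```

In this model every line after the first gains extra leading spaces inside the pre element, which resembles the "all but the first line have extra indent" case reported above.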
Okay, would you mind patching our listing directive to work with literal_block then? Because if I just do it the naïve way, it breaks:
<p>Write your post here.</p>
<section id="header">
<h1>Header</h1>
<p>Foobar.</p>
<pre><pre class="code bash"><a name="rest_code_c55578bfc35343ed9f2d5d33218ae9fa-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_c55578bfc35343ed9f2d5d33218ae9fa-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_c55578bfc35343ed9f2d5d33218ae9fa-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre></pre>
<p>Another.</p>
<pre><pre class="code bash"><a name="rest_code_a483f8a784754812910742f67fe9ab36-1"></a><span class="c"># this is at the beginning of the line</span>
<a name="rest_code_a483f8a784754812910742f67fe9ab36-2"></a><span class="c"># this is indented by 4 spaces</span>
<a name="rest_code_a483f8a784754812910742f67fe9ab36-3"></a><span class="c"># and this line is also indented by 4 spaces</span>
</pre></pre>
</section>
I'll try.
Nikola's original code-block directive proved to work just fine for rst2html. Why not keep it that way for rst2html and let rst2html5 use its own code-block directive? Instead of changing the directive, we would change when the directives are registered, so that one doesn't override the other.
Studying the matter further, I realize that no single code-block directive will fit both rst2html and rst2html5, because they process things differently.
We’d like to keep support for listings and linking to code lines in rst2html5, too. Especially since disabling (unnecessary) indents fixes this bug.
Good point. Please, give me a couple of days to work something out.
I've tried some alternatives, but the simplest solution was for rst2html5 to keep the indentation of raw HTML unchanged. Please revert your last commit 356b40d18e29 ("Use literal_block for Nikola code blocks") and update rst2html5 to version 1.8.1. It should work then.
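The idea of leaving raw HTML untouched while still indenting the rest of the output might look roughly like the following. This is a sketch under assumptions, not rst2html5's actual code: the `Fragment` type and the `render` function are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A piece of output; raw fragments carry pre-rendered HTML (hypothetical type)."""

    text: str
    raw: bool = False


def render(fragments: list[Fragment], indent: str = "    ") -> str:
    """Indent ordinary fragments line by line, but emit raw fragments verbatim."""
    out = []
    for fragment in fragments:
        if fragment.raw:
            # Raw HTML may contain <pre>, so its whitespace must not change.
            out.append(fragment.text)
        else:
            out.extend(indent + line for line in fragment.text.splitlines())
    return "\n".join(out)


code = '<pre class="code bash"># first\n    # second</pre>'
html = render([Fragment("<p>Foobar.</p>"), Fragment(code, raw=True)])
print(html)
```

The trade-off in this sketch is that raw fragments no longer line up with the surrounding markup, in exchange for their content being passed through byte for byte.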
Commit’s already gone, and things are fixed once and for all. All fixed in 78430c5.
after changing to the rest_html5 plugin I noticed wrong indentation for code blocks:
this snippet looks like this in the html4 rest parser:
with the rest_html5 parser it looks like this: