Open fireattack opened 5 months ago
WTF?
Can't be broken for long, that's for sure.
Pretty sure this was working the day before this issue got opened. Probably an internal update to GitHub's rst-to-html renderer which seems to have made it at least twice as slow and it is now hitting a timeout.
This might be the reason.. https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst is definitely a lot slower than it was before.
~~Here is the output from rst2html on the first "free static HTML" host I found: https://gdl.tiiny.site~~
edit: hosted on GitHub: https://gdl-org.github.io/docs/configuration.html
For the time being, maybe the configuration docs link in mikf/gallery-dl/README.rst should be changed, until GitHub (or god knows who) fixes this issue?
Is it just me, or is it working again?
Is it just me, or is it working again?
It looks like they have made some effort to fix it lately, but it is currently still half broken for most complex ones (more info in https://github.com/orgs/community/discussions/86715).
In our case, it's still broken after https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractorcategory-transfer
Now even simpler .rst files like the README don't render properly anymore.
The table of contents is missing and all text in .. code:: sections is underlined.
I think GitHub's rendering looks better and is more convenient.
However, rewriting to Markdown would take some hours of monotonous, boring work. Would a PR be accepted if someone rewrote it? Or does RST have some benefits over MD?
UPD.
Technically, for such a simple thing (rewriting rst to markdown) it's possible to use "a modern chat bot". It should work adequately. ~However, the input is too big I think.~ However, there is a problem with the output being too large. The input as a link seems to be OK.
rewrite rst file from this link
https://raw.githubusercontent.com/mikf/gallery-dl/master/docs/configuration.rst
to a markdown
Anyone can try this?
What do you mean, rewrite? Why would you want to use a chatbot for this?
We have https://github.com/jgm/pandoc
What do you mean, rewrite?
Convert .rst to .md. Converting manually is rewriting.
We have https://github.com/jgm/pandoc
It also produces the wrong result.
$ ./pandoc.exe -o configuration.md configuration.rst
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 246 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 378 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 702 column 51
[WARNING] Reference not found for 'extractor.*.filename' at configuration.rst_chunk line 6 column 62
[WARNING] Reference not found for 'tags' at configuration.rst line 4452 column 32
I don't know. This is the best markup format converter that I've heard of.
Maybe there's something not properly "standard" in the RST?
❌ →
✅ →
$ ./pandoc.exe -o configuration.md configuration.rst --from rst --to gfm
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 703 column 51
[WARNING] Reference not found for 'tags' at configuration.rst line 4453 column 32
These warnings are about the wrong count of underscores (_).
Note that I also use --from rst --to gfm.
https://pandoc.org/MANUAL.html#markdown-variants (gfm (GitHub-Flavored Markdown))
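To see which references pandoc complains about most often, the warning lines can be tallied. This is a minimal Python sketch, assuming the warning format stays exactly as in the output shown in this thread; `count_missing_references` is a hypothetical helper name, not part of pandoc.

```python
import re
from collections import Counter

# Matches lines such as:
# [WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
WARNING_RE = re.compile(
    r"\[WARNING\] Reference not found for '(?P<ref>[^']+)' "
    r"at (?P<source>\S+) line (?P<line>\d+) column (?P<column>\d+)"
)


def count_missing_references(pandoc_stderr: str) -> Counter:
    """Count how often each reference name appears in pandoc's warnings."""
    return Counter(m.group("ref") for m in WARNING_RE.finditer(pandoc_stderr))


# Sample lines copied from the pandoc output posted in this thread.
sample = """\
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 4 column 50
"""

counts = count_missing_references(sample)
for ref, n in counts.most_common():
    print(f"{n:3d}  {ref}")
```

The captured warnings would come from pandoc's stderr, e.g. by redirecting it to a file and passing the file contents to the function.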
One more. The most important one.
❌❌❌ →
✅✅✅ →
Also, now more underscore bugs to fix:
$ ./pandoc.exe -o configuration.md configuration.rst --from rst --to gfm
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 703 column 51
[WARNING] Reference not found for 'skip' at configuration.rst line 845 column 32
[WARNING] Reference not found for 'archive-format' at configuration.rst line 876 column 69
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 4 column 50
[WARNING] Reference not found for 'image-range' at configuration.rst line 1138 column 48
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 1 column 50
[WARNING] Reference not found for 'image-unique' at configuration.rst line 1189 column 50
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 11 column 31
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk line 4 column 38
[WARNING] Reference not found for 'extractor.*.verify' at configuration.rst_chunk line 1 column 22
[WARNING] Reference not found for 'extractor.*.proxy' at configuration.rst_chunk line 1 column 21
[WARNING] Reference not found for 'extractor.*.skip' at configuration.rst_chunk_chunk line 2 column 56
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.directory' at configuration.rst_chunk line 4 column 48
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 5 column 44
[WARNING] Reference not found for 'extractor.*.skip' at configuration.rst_chunk_chunk_chunk line 1 column 66
One second fix (with Notepad++):
The result: https://gist.github.com/AlttiRi/20e5442961f800f8d0f3f1d05d0535e2
No warnings, btw.
But some underscores should be deleted anyway:
❌ →
✅ →
This exact error of "wrong rendering caused by no blank line after headings, when title has certain specific sequence of characters including asterisk" seems to be what GitHub currently has, too. I guess they may have switched from using sphinx to pandoc at some point?
And our current syntax isn't really wrong; GitHub's/pandoc's rendering implementation just isn't up to spec.
According to https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#toc-entry-10:
A blank line after a title is optional. All text blocks up to the next title of the same or higher level are included in a section (or subsection, etc.).
And here are the rules about when an asterisk is recognized as a literal instead of markup: https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#inline-markup-recognition-rules
According to rules 6 & 7 (among others), our case of extractor.*.xxx should not be recognized as markup.
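To illustrate why the asterisk in extractor.*.xxx should stay literal, here is a rough Python sketch of rule 6 (combined with rule 1) from the linked spec. It is a simplification: it only handles ASCII, the set of allowed preceding characters is my reading of the spec and should be checked against the linked page, and `can_start_inline_markup` is a hypothetical name, not a docutils function.

```python
# ASCII characters that may immediately precede an inline markup start-string
# (rule 6, as I read it; the spec also allows similar non-ASCII punctuation).
START_PREFIX_CHARS = set("-:/'\"<([{")


def can_start_inline_markup(text: str, index: int) -> bool:
    """Return True if text[index] could begin inline markup (simplified)."""
    if index == 0:
        preceded_ok = True  # start of the text block
    else:
        prev = text[index - 1]
        preceded_ok = prev.isspace() or prev in START_PREFIX_CHARS
    # Rule 1: a start-string must be immediately followed by non-whitespace.
    followed_ok = index + 1 < len(text) and not text[index + 1].isspace()
    return preceded_ok and followed_ok


title = "extractor.*.archive"
print(can_start_inline_markup(title, title.index("*")))  # False: preceded by "."

prose = "see *emphasis* here"
print(can_start_inline_markup(prose, prose.index("*")))  # True: preceded by a space
```

Under this reading, the `*` in an option title is preceded by `.`, which is not in the allowed set, so it cannot open emphasis.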
Previous discussion: https://github.com/orgs/community/discussions/86715#discussioncomment-9149986
Edit: obviously, if we can fix it by just adding blank lines, we can just work around it as you suggested. I also made a ticket at the pandoc repo.
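The blank-line workaround could be automated rather than done by hand. A minimal Python sketch, assuming titles are underlined with a run of one repeated punctuation character at least as long as the title; `add_blank_after_titles` is a hypothetical helper, and the sample text is made up in the style of the option headings, not copied from configuration.rst.

```python
import re

# A line made of one repeated adornment character (three or more times).
UNDERLINE_RE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")


def add_blank_after_titles(rst_text: str) -> str:
    """Insert a blank line after each section-title underline that lacks one."""
    lines = rst_text.split("\n")
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        is_underline = (
            i > 0
            and UNDERLINE_RE.match(line)
            and lines[i - 1].strip()                 # title line is not blank
            and not UNDERLINE_RE.match(lines[i - 1])  # title is not adornment
            and len(line.rstrip()) >= len(lines[i - 1].rstrip())
        )
        next_is_text = i + 1 < len(lines) and lines[i + 1].strip() != ""
        if is_underline and next_is_text:
            out.append("")
    return "\n".join(out)


title = "extractor.*.filename"
sample = f"{title}\n{'-' * len(title)}\nType\n    * ``string``\n"
fixed = add_blank_after_titles(sample)
print(fixed)
```

Running the function a second time leaves the text unchanged, since the underline is then already followed by a blank line. Single-column tables or transitions that look like underlines could be misdetected, so the result should be reviewed as a diff.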
As suggested, I added a newline after every option name heading in my local copy and wanted to use GitHub's edit and preview feature to see if it actually makes a difference, only to realize that docs/configuration.rst now renders perfectly fine even without these changes.
… Unless I'm missing something, but it looks fine as far as I can tell. The Pandoc issue got fixed as well, but it hasn't been released yet, I don't think.
Maybe they're using the dev main version of pandoc (or at least cherry picking important commits in their downstream)? Assuming they're indeed using pandoc, of course.
Not sure how long it has been broken, but currently https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst does not render, which makes it hard to read. I think it's a GitHub issue, since I see people reporting similar problems.
Maybe we could provide a static rendered HTML version somewhere in the meantime?