mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

configuration.rst no longer renders on GitHub #5059

Open fireattack opened 5 months ago

fireattack commented 5 months ago

Not sure how long it has been broken, but https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst currently does not render, which makes it hard to read. I think it's a GitHub issue, since I see other people reporting similar problems.

Maybe we could provide a static rendered HTML version somewhere in the meantime?

Hrxn commented 5 months ago

WTF?

Can't be broken for long, that's for sure.

mikf commented 5 months ago

Pretty sure this was working the day before this issue got opened. Probably an internal update to GitHub's rst-to-html renderer which seems to have made it at least twice as slow and it is now hitting a timeout.

Hrxn commented 5 months ago

This might be the reason.. https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst is definitely a lot slower than it was before.

mikf commented 5 months ago

~~Here is the output from rst2html on the first "free static HTML" host I found: https://gdl.tiiny.site~~

edit: hosted on GitHub: https://gdl-org.github.io/docs/configuration.html

Hrxn commented 3 months ago

For the time being, maybe the configuration docs link in mikf/gallery-dl/README.rst should be changed, until GitHub (or god knows who) fixes this issue?

Hrxn commented 3 months ago

Is it just me, or is it working again?

fireattack commented 3 months ago

Is it just me, or is it working again?

It looks like they have made some effort to fix it lately, but rendering is currently still half broken for most complex files (more info in https://github.com/orgs/community/discussions/86715).

In our case, it's still broken after https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractorcategory-transfer

mikf commented 3 months ago

Now even simpler .rst files like the README don't render properly anymore. The table of contents is missing and all text in .. code:: sections is underlined.

AlttiRi commented 1 month ago

https://gdl-org.github.io/docs/configuration.html

I think GitHub's rendering looks better and is more convenient.

However, rewriting it in Markdown would take a few hours of monotonous, boring work. Would a PR be accepted if someone rewrote it? Or does RST have some benefits over MD?


UPD.

Technically, for such a simple task (rewriting rst to markdown) it's possible to use "a modern chat bot". It should work adequately. ~However, the input is too big I think.~ However, the output is too large, which is a problem. The input as a link seems to be OK.

rewrite rst file from this link
https://raw.githubusercontent.com/mikf/gallery-dl/master/docs/configuration.rst
to a markdown

Anyone can try this?

Hrxn commented 1 month ago

What do you mean, rewrite? Why would you want to use a chatbot for this?

We have https://github.com/jgm/pandoc

AlttiRi commented 1 month ago

What do you mean, rewrite?

Convert .rst to .md. Converting manually is rewriting.


We have https://github.com/jgm/pandoc

It also produces the wrong result.

$ ./pandoc.exe -o configuration.md configuration.rst
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 246 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 378 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 702 column 51
[WARNING] Reference not found for 'extractor.*.filename' at configuration.rst_chunk line 6 column 62
[WARNING] Reference not found for 'tags' at configuration.rst line 4452 column 32

Here is the result: https://gist.github.com/AlttiRi/20e5442961f800f8d0f3f1d05d0535e2/253a2cac3f57f103889f326b9cf4499263e96ff2#extractorcategory-transfer
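The warnings quoted above all follow one shape, so a small script can tally which reference names pandoc failed to resolve. This is only a sketch: the regular expression is derived from the log lines in this thread, not from any documented pandoc output format, and the function name `count_missing_references` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the warnings quoted in this thread, e.g.
# [WARNING] Reference not found for 'retries' at configuration.rst line 378 column 52
# Assumption: other pandoc versions may word these lines differently.
WARNING_RE = re.compile(
    r"^\[WARNING\] Reference not found for '(?P<name>[^']+)' "
    r"at (?P<source>\S+) line (?P<line>\d+) column (?P<column>\d+)$"
)


def count_missing_references(log_text: str) -> Counter:
    """Count how often each reference name shows up in 'Reference not found' warnings."""
    counts = Counter()
    for raw_line in log_text.splitlines():
        match = WARNING_RE.match(raw_line.strip())
        if match:
            counts[match.group("name")] += 1
    return counts


# Three lines copied from the first pandoc run above.
sample_log = """\
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 246 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
"""

if __name__ == "__main__":
    for name, count in count_missing_references(sample_log).most_common():
        print(f"{count:3d}  {name}")
```

Feeding it the saved stderr of a pandoc run would show at a glance which references come up most often, which might help when deciding what to fix first.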

Hrxn commented 1 month ago

I don't know. This is the best markup format converter that I've heard of.

Maybe there's something not properly "standard" in the RST?

AlttiRi commented 1 month ago

[image]

[image]


$ ./pandoc.exe -o configuration.md configuration.rst --from rst --to gfm
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 703 column 51
[WARNING] Reference not found for 'tags' at configuration.rst line 4453 column 32

These warnings are about the wrong count of underscores (_).


Note that I also use --from rst --to gfm. https://pandoc.org/MANUAL.html#markdown-variants (gfm (Github-Flavored Markdown))

AlttiRi commented 1 month ago

One more. The most important one.

❌❌❌ [image]

✅✅✅ [image]

Also, there are now more underscore bugs to fix:

$ ./pandoc.exe -o configuration.md configuration.rst --from rst --to gfm
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 703 column 51
[WARNING] Reference not found for 'skip' at configuration.rst line 845 column 32
[WARNING] Reference not found for 'archive-format' at configuration.rst line 876 column 69
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 4 column 50
[WARNING] Reference not found for 'image-range' at configuration.rst line 1138 column 48
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 1 column 50
[WARNING] Reference not found for 'image-unique' at configuration.rst line 1189 column 50
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 11 column 31
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk line 4 column 38
[WARNING] Reference not found for 'extractor.*.verify' at configuration.rst_chunk line 1 column 22
[WARNING] Reference not found for 'extractor.*.proxy' at configuration.rst_chunk line 1 column 21
[WARNING] Reference not found for 'extractor.*.skip' at configuration.rst_chunk_chunk line 2 column 56
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 2 column 34
[WARNING] Reference not found for 'extractor.*.directory' at configuration.rst_chunk line 4 column 48
[WARNING] Reference not found for 'extractor.*.image-filter' at configuration.rst_chunk line 5 column 44
[WARNING] Reference not found for 'extractor.*.skip' at configuration.rst_chunk_chunk_chunk line 1 column 66

AlttiRi commented 1 month ago

A one-second fix (with Notepad++):

[image]

The result: https://gist.github.com/AlttiRi/20e5442961f800f8d0f3f1d05d0535e2

No warnings, btw.

AlttiRi commented 1 month ago

But some underscores should be deleted anyway: [image]

fireattack commented 1 month ago

[image]

[image]

$ ./pandoc.exe -o configuration.md configuration.rst --from rst --to gfm
[WARNING] Reference not found for 'extractor.*.path-replace' at configuration.rst_chunk_chunk line 1 column 102
[WARNING] Reference not found for 'path-restrict' at configuration.rst line 247 column 47
[WARNING] Reference not found for 'extractor.*.archive' at configuration.rst_chunk line 3 column 69
[WARNING] Reference not found for 'retries' at configuration.rst line 379 column 52
[WARNING] Reference not found for 'extractor.*.cookies' at configuration.rst_chunk_chunk line 1 column 47
[WARNING] Reference not found for 'keywords' at configuration.rst line 703 column 51
[WARNING] Reference not found for 'tags' at configuration.rst line 4453 column 32

These warnings are about the wrong count of underscores (_).

Note that I also use --from rst --to gfm. https://pandoc.org/MANUAL.html#markdown-variants (gfm (Github-Flavored Markdown))

This exact error of "wrong rendering caused by no blank line after headings, when title has certain specific sequence of characters including asterisk" seems to be what GitHub currently has, too. I guess they may have switched from using sphinx to pandoc at some point?

And our current syntax isn't really wrong; it's just that GitHub's/pandoc's rendering implementation isn't up to spec.

According to https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#toc-entry-10:

A blank line after a title is optional. All text blocks up to the next title of the same or higher level are included in a section (or subsection, etc.).

And here are the rules for when an asterisk is recognized as a literal instead of markup: https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#inline-markup-recognition-rules

According to rules 6 & 7 (among others), our case of extractor.*.xxx should not be recognized as markup.
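To make the rule 6 argument concrete, here is a rough sketch of the start-string check as I read it in the linked spec. The allowed-character set and the extra "followed by non-whitespace" condition are my reading of the recognition rules, the spec's "similar non-ASCII punctuation" clause is ignored, and `may_start_emphasis` is a made-up name, so treat it as an illustration rather than a reimplementation of docutils.

```python
# ASCII characters that, as I read rule 6, may directly precede an inline
# markup start-string. Assumption: non-ASCII punctuation is ignored here.
ALLOWED_BEFORE_START = set("'\"([{<-/:")


def may_start_emphasis(text: str, index: int) -> bool:
    """Return True if the '*' at text[index] could open emphasis (simplified check)."""
    if text[index] != "*":
        raise ValueError("index must point at an asterisk")
    # Rule 6: the start-string must begin the text block or follow
    # whitespace or one of the allowed characters.
    if index > 0:
        before = text[index - 1]
        if not (before.isspace() or before in ALLOWED_BEFORE_START):
            return False
    # A start-string must also be immediately followed by non-whitespace.
    after = text[index + 1 : index + 2]
    return bool(after) and not after.isspace()


if __name__ == "__main__":
    title = "extractor.*.archive"
    # The asterisk follows a '.', which is not in the allowed set.
    print(may_start_emphasis(title, title.index("*")))
    sentence = "see *this* word"
    print(may_start_emphasis(sentence, sentence.index("*")))
```

Under this simplified check the asterisk in `extractor.*.archive` is rejected because it follows a period, which matches the claim that such titles should stay literal.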

Previous discussion: https://github.com/orgs/community/discussions/86715#discussioncomment-9149986


Edit: obviously, if we can fix it by just adding blank lines, we can work around it as you suggested. I also opened a ticket in the pandoc repo.
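For reference, the blank-line workaround could be scripted along these lines. This is a minimal sketch, not what anyone in this thread actually ran: `add_blank_line_after_titles` is a hypothetical helper, it only handles the title-plus-underline heading style, and it does not try to skip literal blocks that happen to contain underline-like lines.

```python
import re

# A section underline: a single ASCII punctuation character repeated.
# Assumption: two or more characters; rows with spaces (tables) never match.
UNDERLINE_RE = re.compile(r"^([!-/:-@\[-`{-~])\1+\s*$")


def add_blank_line_after_titles(rst_text: str) -> str:
    """Insert a blank line after each underlined title that lacks one."""
    lines = rst_text.split("\n")
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        if i == 0 or i + 1 >= len(lines):
            continue
        title = lines[i - 1]
        next_line = lines[i + 1]
        is_underline = (
            UNDERLINE_RE.match(line) is not None
            and title.strip() != ""
            and not UNDERLINE_RE.match(title)
            and len(line.rstrip()) >= len(title.rstrip())
        )
        # Only add a blank line when the body starts right after the underline.
        if is_underline and next_line.strip() != "":
            out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    title = "extractor.*.archive"
    sample = title + "\n" + "-" * len(title) + "\nType\n    string\n"
    print(add_blank_line_after_titles(sample))
```

Running it twice gives the same result as running it once, since headings that already have a blank line after the underline are left alone.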

mikf commented 1 month ago

As suggested, I added a newline after every option name heading in my local copy and wanted to use GitHub's edit and preview feature to see if it actually makes a difference, only to realize that docs/configuration.rst now renders perfectly fine even without these changes.

… Unless I'm missing something, but it looks fine as far as I can tell. The Pandoc issue got fixed as well, but it hasn't been released yet, I don't think.

fireattack commented 1 month ago

Maybe they're using a dev (main) version of pandoc (or at least cherry-picking important commits in their downstream)? Assuming they're indeed using pandoc, of course.