text = text.lower().replace(" ", "-")
text = re.compile(r"[`~!@#$%^&*()+=<>?,./:;\"'|{}\[\]\\–—]").sub("", text)
text = re.compile(r"[　。？！，、；：“”【】（）〔〕［］﹃﹄“”‘’﹁﹂—…－～《》〈〉「」]").sub("", text)  # CJK punctuation
return text
Describe the bug:
If I have a non-English category (e.g. `类`), then the TOC will be generated as `[类](#)`, whose `href` is empty.
Expected behaviour:
Render it as `[类](#类)` or `[类](#category-id)`.
Steps to reproduce the issue:
(I've described it above)
👇 `[网站](#)`. 👇 GitHub's anchor is `#网站`.
Technical details:
Possible Fix:
Our `process_md_link` differs from GitHub's.
https://github.com/best-of-lists/best-of-generator/blob/4e07c02a36d964c28ceab6de53c74be84a633286/src/best_of/generators/markdown_list.py#L486-L488
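The empty `href` can be reproduced with a small sketch. It assumes the anchor is built by lowercasing, replacing spaces with hyphens, and dropping everything outside ASCII letters, digits, and hyphens; that is an assumption for illustration, and the linked lines remain the authoritative implementation. `ascii_only_anchor` and `toc_entry` are hypothetical names, not functions from the project.

```python
import re


def ascii_only_anchor(text: str) -> str:
    """Hypothetical stand-in for an ASCII-only anchor filter (assumption)."""
    text = text.lower().replace(" ", "-")
    # Every character outside a-z, 0-9 and "-" is dropped, including CJK.
    return re.sub(r"[^a-z0-9-]", "", text)


def toc_entry(title: str) -> str:
    """Build a Markdown TOC link for a category title."""
    return f"[{title}](#{ascii_only_anchor(title)})"


print(toc_entry("Web Sites"))  # [Web Sites](#web-sites)
print(toc_entry("类"))          # [类](#)
print(toc_entry("网站"))        # [网站](#)
```

Under this assumption, any title made up entirely of non-ASCII characters collapses to an empty anchor, which matches the behaviour described above.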
GitHub's algorithm is not documented, but people have discussed it at https://gist.github.com/asabaylus/3071099. In short, CJK and other Unicode characters matter.
https://gist.github.com/asabaylus/3071099?permalink_comment_id=1593627#gistcomment-1593627
https://gist.github.com/asabaylus/3071099?permalink_comment_id=2563127#gistcomment-2563127
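As a rough illustration of the "Unicode characters matter" point, here is a minimal sketch of a Unicode-aware anchor function. It is only an approximation and not GitHub's actual algorithm, which is undocumented; `unicode_anchor` is a hypothetical name. It relies on the fact that `\w` in Python 3 `str` patterns matches word characters in any script, so CJK letters survive while ASCII and full-width punctuation are dropped.

```python
import re


def unicode_anchor(text: str) -> str:
    """Approximate a heading anchor while keeping non-ASCII letters (sketch)."""
    text = text.strip().lower()
    # Keep Unicode word characters, hyphens and spaces; drop everything else.
    text = re.sub(r"[^\w\- ]", "", text)
    # Each remaining space becomes a hyphen.
    return text.replace(" ", "-")


print(unicode_anchor("类"))             # 类
print(unicode_anchor("网站"))           # 网站
print(unicode_anchor("Hello, World!"))  # hello-world
print(unicode_anchor("你好，世界"))      # 你好世界
```

A whitelist like this avoids having to enumerate every punctuation mark, at the cost of possibly diverging from GitHub on edge cases such as emoji or duplicate headings, which this sketch does not handle.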
Additional context:
Relying on GitHub's tricky algorithm may be a bad idea; we could use category IDs instead. (The `<a id />` trick does not work.)
I tried to make a PR, but `title_md_prefix: str = "##"` is not compatible with `<h2>`. However, no code calls them with `title_md_prefix`. Can I make it private?
https://github.com/best-of-lists/best-of-generator/blob/4e07c02a36d964c28ceab6de53c74be84a633286/src/best_of/generators/markdown_list.py#L334-L336
https://github.com/best-of-lists/best-of-generator/blob/4e07c02a36d964c28ceab6de53c74be84a633286/src/best_of/generators/markdown_list.py#L437-L439