Closed thth closed 3 years ago
Thanks @thth for finding and reporting this issue (on top of the other 3)! I will take a look at your remaining PRs this weekend 🙌🏼
Thanks! By the way, the hacktoberfest-accepted tags are spelt wrong 😅
Whoops! Thanks for raising that - I just went through and updated the tags by adding a new hacktoberfest-accepted label and then deleting the hacktoberest-accepted label!
Tables of contents weren't being generated for Bengali or Tamil because the regex for parsing headers didn't account for characters that fall under the Unicode general category for marks (M).
I don't know Bengali, Tamil, or Unicode, so I have no idea why these languages have characters distinguished as marks but not others 🤷
I'm also not sure why the regex for recognizing headers, which is now
~r/<(h\d)>(["\p{L}\p{M}\s?!\.\/\d]+)(?=<\/\1>)/iu
is so specific. Perhaps it would be good to generalize it somewhat? Currently it wouldn't catch any transcendent numerals, symbols, or punctuation beyond the few characters listed explicitly (the other Unicode general categories).
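For anyone who wants to see the difference, here is a rough sketch comparing the header regex with and without `\p{M}`. The `HeaderSketch` module and its function names are made up for illustration and aren't part of this PR, and the "without marks" pattern is only my assumption of what the earlier regex looked like (the current one with `\p{M}` removed).

```elixir
defmodule HeaderSketch do
  # Hypothetical module for illustration only; not code from this repository.

  # Assumed shape of the earlier pattern: the current regex minus \p{M}.
  def without_marks, do: ~r/<(h\d)>(["\p{L}\s?!\.\/\d]+)(?=<\/\1>)/iu

  # The pattern quoted in this PR description.
  def with_marks, do: ~r/<(h\d)>(["\p{L}\p{M}\s?!\.\/\d]+)(?=<\/\1>)/iu

  # Returns the captured header text, or nil when the regex does not match.
  def header_text(regex, html) do
    case Regex.run(regex, html) do
      [_full, _tag, text] -> text
      nil -> nil
    end
  end
end

# Sample headers: English, Bengali, and Tamil.
headers = ["<h2>Contents</h2>", "<h2>বাংলা</h2>", "<h2>தமிழ்</h2>"]

for html <- headers do
  IO.inspect({
    html,
    HeaderSketch.header_text(HeaderSketch.without_marks(), html),
    HeaderSketch.header_text(HeaderSketch.with_marks(), html)
  })
end
```

As far as I can tell, the Bengali and Tamil headers only match once `\p{M}` is in the character class, because their vowel signs and the Tamil virama are classified as marks rather than letters, so the match stops before reaching the closing tag.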