yuin / goldmark

:trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.
MIT License

Update case folding map to unicode 14.0.0 #264

Closed kpym closed 2 years ago

kpym commented 2 years ago

In gen-unicode-case-folding-map.go the Unicode version is hard-coded to 12.1.0, but the latest version is 14.0.0, which contains some (minor) changes.

It may be a good idea to fetch the latest version automatically. I was not able to find an easy way to do this, but it can be done by requesting https://www.unicode.org/versions/latest/, which returns a 302 status code (redirect) pointing to the URL of the latest version, https://www.unicode.org/versions/UnicodeXX.Y.Z/.
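A minimal sketch of that redirect approach, assuming the `Location` header ends in `/UnicodeXX.Y.Z/` as described above. The function names `versionFromLocation` and `latestUnicodeVersion` are hypothetical and are not part of goldmark.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// versionRe matches the version at the end of a redirect target such as
// https://www.unicode.org/versions/Unicode14.0.0/
var versionRe = regexp.MustCompile(`/Unicode(\d+\.\d+\.\d+)/?$`)

// versionFromLocation extracts "X.Y.Z" from a redirect target URL.
func versionFromLocation(location string) (string, error) {
	m := versionRe.FindStringSubmatch(location)
	if m == nil {
		return "", fmt.Errorf("no Unicode version found in %q", location)
	}
	return m[1], nil
}

// latestUnicodeVersion requests latestURL without following the redirect
// and reads the version from the Location header of the 3xx response.
func latestUnicodeVersion(latestURL string) (string, error) {
	client := &http.Client{
		// Return the redirect response itself instead of following it.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(latestURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 300 || resp.StatusCode >= 400 {
		return "", fmt.Errorf("expected a redirect, got %s", resp.Status)
	}
	return versionFromLocation(resp.Header.Get("Location"))
}
```

Calling `latestUnicodeVersion("https://www.unicode.org/versions/latest/")` should yield the version string as long as the site keeps answering with a redirect of that form. Any other response is reported as an error rather than guessed at.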

kpym commented 2 years ago

I asked unicode.org how to get the latest version of CaseFolding.txt, and Rick McGowan's answer was very satisfactory:

The "latest" link you are probably looking for is this one:

https://www.unicode.org/Public/UCD/latest/

That is a stable URL that we do maintain.

Example:

https://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt

The "latest" changes approximately once per year when a version of The Unicode Standard is released. There is little to be gained by checking that link or contents very often... so please do not add heavyweight access that runs frequently to download files that probably have not changed. Perhaps you were thinking of polling to check file dates once in a while.

So it should be enough to replace http://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt with https://www.unicode.org/Public/UCD/latest/ucd/CaseFolding.txt in gen-unicode-case-folding-map.go.
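The polling that Rick McGowan mentions could be kept lightweight along these lines. This is only a sketch: it assumes the server sends a `Last-Modified` header for CaseFolding.txt, and the names `newerThan` and `caseFoldingChanged` are hypothetical. A HEAD request is used so that the file body is not downloaded just to check its date.

```go
package main

import (
	"fmt"
	"net/http"
)

// newerThan reports whether the lastModified header value is later than the
// value recorded at the previous check. An empty previous value counts as
// "changed", since nothing has been seen yet.
func newerThan(lastModified, previous string) (bool, error) {
	current, err := http.ParseTime(lastModified)
	if err != nil {
		return false, err
	}
	if previous == "" {
		return true, nil
	}
	seen, err := http.ParseTime(previous)
	if err != nil {
		return false, err
	}
	return current.After(seen), nil
}

// caseFoldingChanged sends a HEAD request to url and compares the
// Last-Modified header (assumed to be present) with the previous value.
func caseFoldingChanged(url, previous string) (bool, error) {
	resp, err := http.Head(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return newerThan(resp.Header.Get("Last-Modified"), previous)
}
```

Since the "latest" link changes roughly once per year, a check like this only needs to run occasionally, for example when the generator is run by hand.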

yuin commented 2 years ago

I've just replaced the URL in gen-unicode-case-folding-map.go.

Using the latest URL may cause unexpected breakage, so I've replaced it with an explicit versioned URL.
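A rough illustration of the pinned approach, not the actual contents of gen-unicode-case-folding-map.go. The constant `unicodeVersion` and the helpers `caseFoldingURL` and `headerMatches` are hypothetical names. The header check assumes that the first line of CaseFolding.txt has the form `# CaseFolding-X.Y.Z.txt`.

```go
package main

import "strings"

// unicodeVersion is the pinned version. Bumping it is a deliberate change
// rather than something that happens when "latest" moves.
const unicodeVersion = "14.0.0"

// caseFoldingURL builds the versioned URL, following the pattern of the
// 12.1.0 URL quoted earlier in this thread.
func caseFoldingURL(version string) string {
	return "https://www.unicode.org/Public/" + version + "/ucd/CaseFolding.txt"
}

// headerMatches reports whether the first line of the downloaded file names
// the expected version (assumed format: "# CaseFolding-14.0.0.txt").
func headerMatches(firstLine, version string) bool {
	return strings.TrimSpace(firstLine) == "# CaseFolding-"+version+".txt"
}
```

With a pinned version, the generator produces the same map on every run until the constant is changed. That is the trade-off against following the latest URL automatically.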