Open wismill opened 2 weeks ago
> However, it is not clear to me if `ghc-internal` could depend on `unicode-data-core` as well, as we do not want the compiler to fix the Unicode version.

That sounds unlikely to me: `ghc-internal` is not meant to be reinstallable.

Having `unicode-data-core` as a separate package sounds like a good idea to me because `unicode-data` is too big and might grow with more stuff. But we also need to figure out how GHC can share it so that the maintenance effort is reduced.

Currently the Unicode version in `base` is not upgradable because it depends on the GHC version via the `ghc-internal` package.

This raises two main issues:
- `text` has case mappings from Unicode 14.0 but also uses `Data.Char`, which may have a different Unicode version (`base-4.20` uses Unicode 15.1).

The `unicode-data` (Hackage) package family offers a way to choose an exact Unicode version and gives access to Unicode features unavailable in `base`. Some of its core features were merged into `base` (see #59).

I propose that we go further and decouple the Unicode version from the GHC version by introducing a new core library, `unicode-data-core`, that would back `Data.Char`. Its code would be the code currently in `GHC.Internal.Unicode*` (probably under another namespace), with the optional addition of the complex case mappings for `text` and any other basic feature deemed useful for core libraries.

Such a package would need little maintenance: Unicode publishes versions on a yearly basis and the API is very stable.

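For context, here is a minimal sketch of how to check which Unicode version a given GHC pins for `Data.Char`. It uses only `base`: `unicodeVersion` is exported by `GHC.Unicode` in recent `base` releases (since 4.15, as far as I know). The probe characters and the `describe` helper are arbitrary choices for illustration and not part of any existing API.

```haskell
module Main (main) where

import Data.Char (generalCategory, toUpper)
import Data.Version (showVersion)
import GHC.Unicode (unicodeVersion)
import Text.Printf (printf)

-- Arbitrary probe characters (an assumption, chosen only for illustration).
probes :: [Char]
probes = ['a', 'ß', '\x1E9E']

-- One report line per character: code point, general category, and the
-- result of the simple (single-character) uppercase mapping from Data.Char.
describe :: Char -> String
describe c =
  printf "U+%04X  %-20s toUpper -> U+%04X"
    (fromEnum c) (show (generalCategory c)) (fromEnum (toUpper c))

main :: IO ()
main = do
  -- This version is fixed by the GHC in use; it cannot be changed by
  -- installing another package, which is the limitation discussed here.
  putStrLn ("Unicode version used by base: " ++ showVersion unicodeVersion)
  mapM_ (putStrLn . describe) probes
```

Running this with two different GHC releases may print different versions, and any library that ships its own Unicode tables next to `Data.Char` can end up mixing the two.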
However, it is not clear to me if `ghc-internal` could depend on `unicode-data-core` as well, as we do not want the compiler to fix the Unicode version.

CC `unicode-data` team: @adithyaov @Bodigrim @harendra-kumar