Closed frasertweedale closed 5 years ago
windows-1252 is one I run into a lot. I do feel text-icu is quite a big dependency for just a few of the common ones. I wonder how much work it would take to support some common charsets like windows-1252 natively, and use text-icu for the rest?
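To give a rough idea of the effort involved, here is a minimal sketch of a standalone windows-1252 decoder that needs only bytestring and text. The names decodeWindows1252, toChar and c1Table are hypothetical and are not part of purebred-email. windows-1252 matches iso-8859-1 except in the range 0x80-0x9F, so a 32-entry table covers the difference. The five bytes the code page leaves unassigned are mapped to U+FFFD here, which is an assumption; other decoders pass them through as C1 control characters.

```haskell
import qualified Data.ByteString as B
import Data.Char (chr)
import qualified Data.Text as T
import Data.Word (Word8)

-- Hypothetical helper: decode windows-1252 bytes into Text.
decodeWindows1252 :: B.ByteString -> T.Text
decodeWindows1252 = T.pack . map toChar . B.unpack

-- Bytes outside 0x80-0x9F map to the code point of the same value,
-- exactly as in iso-8859-1.
toChar :: Word8 -> Char
toChar w
  | w >= 0x80 && w <= 0x9F = c1Table !! fromIntegral (w - 0x80)
  | otherwise = chr (fromIntegral w)

-- Code points for bytes 0x80..0x9F. The unassigned bytes
-- (0x81, 0x8D, 0x8F, 0x90, 0x9D) become U+FFFD (assumption).
c1Table :: [Char]
c1Table = map chr
  [ 0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021
  , 0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD
  , 0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014
  , 0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
  ]
```

The list lookup is linear, but the table has only 32 entries; an unboxed array would be the obvious swap if it ever mattered.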
I've got upcoming changes to both purebred-email and purebred to support this, as well as our first official plugin, purebred-icu :)
Currently we only support the us-ascii, iso-8859-1 and utf-8 charsets. But there are many more common charsets. Found in a corpus of my personal email were:

And there are undoubtedly many more we need to support.
The text-icu package is a binding to libicu with support for all common charsets. It does some things impurely (namely, loading converters), and its precise behaviour w.r.t. unrecognised charset names is not clear from the docs.
I'm unsure if we would want purebred-email to depend on text-icu, or if we're better off having pluggable charset support and a supplementary module for bringing in the "expanded suite" via text-icu or some other means.
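For the pluggable option, one possible shape is a lookup function from charset name to decoder, which a supplementary package could extend. This is only a sketch under that assumption: Decoder, CharsetLookup, builtinCharsets, orElse and decodeWith are hypothetical names, not an existing purebred-email API, and charset names are plain Strings here for brevity.

```haskell
import Control.Applicative ((<|>))
import qualified Data.ByteString as B
import Data.Char (toLower)
import qualified Data.Map.Strict as M
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical types: a decoder, and a lookup from charset name to decoder.
type Decoder = B.ByteString -> T.Text
type CharsetLookup = String -> Maybe Decoder

-- The three charsets mentioned above, matched case-insensitively.
builtinCharsets :: CharsetLookup
builtinCharsets name = M.lookup (map toLower name) table
  where
    table = M.fromList
      [ ("us-ascii", asciiOnly . TE.decodeLatin1)
      , ("iso-8859-1", TE.decodeLatin1)
      , ("utf-8", TE.decodeUtf8With lenientDecode)
      ]
    -- Bytes above 0x7F are not valid us-ascii; replace them with U+FFFD.
    asciiOnly = T.map (\c -> if c > '\x7f' then '\xfffd' else c)

-- Try the first lookup, then fall back to the second. An extension
-- package would supply the second argument.
orElse :: CharsetLookup -> CharsetLookup -> CharsetLookup
orElse f g name = f name <|> g name

-- Decode with whatever lookup the application has assembled.
-- Nothing means the charset name was not recognised.
decodeWith :: CharsetLookup -> String -> B.ByteString -> Maybe T.Text
decodeWith look name bytes = fmap ($ bytes) (look name)
```

With this shape the core library stays free of the libicu dependency, and an application that wants the expanded suite composes its lookup as builtinCharsets `orElse` someExtraLookup. A lookup backed by text-icu would probably have to deal with the impure converter loading noted above, which this pure signature does not address.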