Closed belisoful closed 1 year ago
While I was sleeping on this PR, it occurred to me that if $lang is null, we could check $from/$to for a '.' and, if one is present, split the value into the encoding and the language.
Basically, the encoding can have ".fr" appended to it to designate the French variant of that encoding.
I have an IPTC class for reading and writing IPTC that will use this updated TUtfConverter and the ESC charset converter at some point.
BTW, PHP has weak support for reading and writing IPTC. The class I have does much better, and it encodes the various constants for field names/IDs that would otherwise be left to each implementation.
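A minimal sketch of that split, assuming the language suffix follows the last '.' in the encoding name. The helper name splitEncodingLang is hypothetical and is not part of TUtfConverter; the actual change in the PR may differ.

```php
<?php
/**
 * Hypothetical helper (not the PR's actual code): if $lang is null and the
 * encoding name contains a '.', split it into [encoding, lang] at the last dot.
 * An explicit $lang always wins, and the encoding is then left untouched.
 */
function splitEncodingLang(string $encoding, ?string $lang = null): array
{
    if ($lang === null) {
        $pos = strrpos($encoding, '.');
        // Only split when the dot has text on both sides.
        if ($pos !== false && $pos > 0 && $pos < strlen($encoding) - 1) {
            $lang = substr($encoding, $pos + 1);
            $encoding = substr($encoding, 0, $pos);
        }
    }
    return [$encoding, $lang];
}

// Illustrative input only: "ASCII.fr" yields encoding "ASCII" and lang "fr".
[$encoding, $lang] = splitEncodingLang('ASCII.fr');
```

With this approach, a charset whose real name contains a dot would also be split, so such names would need an explicit $lang or a different rule.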
> While I was sleeping on this PR, it came to me that if $lang is null, to check the $from/$to for a '.', and then pull the encoding and lang apart.
Looking at the output of iconv -l, I can see that some charsets already contain a dot in their names.
They all look quite exotic:
ANSI_X3.4-1968, ANSI_X3.4-1986, ANSI_X3.4, ANSI_X3.110-1983, ANSI_X3.110
CSA_Z243.4-1985-1, CSA_Z243.4-1985-2, CSA_Z243.419851, CSA_Z243.419852
ISO_646.IRV:1991, JUS_I.B1.002, MSZ_7795.3, T.61-8BIT, T.61, T.618BIT, TIS620.2529-1, TIS620.2533-0
I guess we can live without these. LGTM
These functions add the $lang parameter, which is passed to PHP's setlocale(LC_CTYPE, $lang), because various countries/languages use slightly different character sets despite sharing the same encoding, e.g. ASCII has different national standards.
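As a rough illustration of how a $lang parameter could be applied around a conversion, here is a sketch that sets LC_CTYPE, calls iconv(), and then restores the previous locale. The function name convertWithLocale is hypothetical, and whether the PR restores the locale afterwards is an assumption; $lang must be a locale name that exists on the host system.

```php
<?php
/**
 * Hypothetical wrapper (not the PR's actual code): convert $text from $from
 * to $to with iconv(), temporarily switching LC_CTYPE to $lang when given.
 * Returns the converted string, or false if iconv() fails.
 */
function convertWithLocale(string $text, string $from, string $to, ?string $lang = null)
{
    // Passing "0" queries the current locale without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    if ($lang !== null) {
        // setlocale() returns false and leaves the locale unchanged
        // when the requested locale is not installed.
        setlocale(LC_CTYPE, $lang);
    }

    try {
        return iconv($from, $to, $text);
    } finally {
        // Restore the caller's locale so the change does not leak.
        if ($lang !== null && $previous !== false) {
            setlocale(LC_CTYPE, $previous);
        }
    }
}
```

Restoring the locale matters because setlocale() affects the whole process, not just the current call.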
This is the most comprehensive list of ESC character set encodings I was able to find in a reasonable amount of time.