Closed happy-barney closed 2 years ago
This PR doesn't contain documentation (yet). Consider it a proof of concept of how $subj can be implemented.
+1 for a separate CPAN module under the Encode:: namespace.
The module argument works for the GSM charset as well.
I do not intend to do anything else apart from sharing the source. It's up to you to decide what to do with it (i.e., give Perl some competitive advantage ...)
As for the usage of these tables ... it depends on the country. Obviously it is not very likely that you will receive a Hindi message in Europe ...
For example, in one project in Germany that I participated in, around 2% of the received messages used the Turkish language set.
I'm surprised that some devices are still generating SMSs in national sets instead of the universal Unicode/UCS-2.
@pali don't be. An SMS can carry 160 characters (140 bytes). Choosing national sets costs a header of 1 + (3 for the base table) + (3 for the shift table) bytes, leaving room for 155 (or 152 with both tables) characters, whereas UCS-2 is strictly 16 bits per character = 70 characters.
You can send longer messages, but every part then needs an additional concatenation element in its UDH (5 bytes with an 8-bit reference number).
As a result, a UCS-2 message is about twice as expensive as a GSM charset message.
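The capacity figures in the comment above can be checked with a short sketch. It assumes a 140-byte payload, a header made of one length byte plus 3 bytes per national language table, and padding of the header up to the next septet boundary in 7-bit messages. The function names are hypothetical and exist only for this illustration; they are not part of the PR.

```python
PAYLOAD_BYTES = 140  # user data available in a single SMS


def septet_capacity(header_bytes: int) -> int:
    """Characters left in a 7-bit message after a header of header_bytes.

    header_bytes includes the header length byte itself (assumption:
    1 length byte + 3 bytes per national language table).
    """
    total_septets = PAYLOAD_BYTES * 8 // 7           # 160 septets
    header_septets = (header_bytes * 8 + 6) // 7     # round up to whole septets
    return total_septets - header_septets


def ucs2_capacity(header_bytes: int = 0) -> int:
    """Characters left in a UCS-2 message (2 bytes per character)."""
    return (PAYLOAD_BYTES - header_bytes) // 2


print(septet_capacity(0))          # 160: plain GSM charset, no header
print(septet_capacity(1 + 3))      # 155: one national table
print(septet_capacity(1 + 3 + 3))  # 152: base and shift tables
print(ucs2_capacity())             # 70: UCS-2, no header
```

With both national tables selected the 7-bit message still holds more than twice as many characters as a UCS-2 one, which is the cost argument made above.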
I know; I have read and implemented TS 123 038 over TS 123 040 over ES 201 912 over V.23 over RFC 3261.
I just have not seen mobile devices that generate SMSs in national sets anymore...
Note that national sets also contain characters behind an escape sequence, and using those costs two septets each (comparable to the 2 bytes per character of UCS-2). But using characters from the primary (non-escape) part really does decrease the size of an SMS.
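The cost of escaped characters can be illustrated with a small counting sketch. It uses the extension table of the default GSM 7-bit alphabet as a stand-in, since a national shift table is addressed through the escape septet in the same way. The helper name is hypothetical, and the function assumes that every character of the input is encodable, so it does no validation.

```python
# Characters of the default GSM 7-bit extension table. Each one is sent as
# the escape septet 0x1B followed by a second septet, so it costs two septets.
GSM_EXTENSION_CHARS = set("^{}\\[~]|€\f")


def septet_length(text: str) -> int:
    """Septets needed for text, assuming every character is encodable."""
    return sum(2 if ch in GSM_EXTENSION_CHARS else 1 for ch in text)


print(septet_length("price: 5 EUR"))  # 12: no escaped characters
print(septet_length("price: 5 €"))    # 11: the euro sign counts twice
print(septet_length("[ok]"))          # 6: both brackets are escaped
```

A message made up mostly of escaped characters therefore loses much of the advantage over UCS-2, while one that stays in the primary table keeps it.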
Most likely that is observer bias, due to the fact that you are not living in a country where a supported language is used (e.g. Turkish or Hindi).
PR #149 reminded me of work I started a few years ago.