OpenSIPS / opensips

OpenSIPS is a GPL implementation of a multi-functionality SIP Server that targets to deliver a high-level technical solution (performance, security and quality) to be used in professional SIP server platforms.
https://opensips.org

[FEATURE] proto_smpp support Unicode (UTF-8 / 16) for sending out multi language SMS #1770

Open volga629 opened 5 years ago

volga629 commented 5 years ago

It would be nice if proto_smpp could handle multiple languages and special characters. We tested version 3.0.0 and only English works.

volga629 commented 5 years ago

Test results

>>Sent>> Café >>Received>> Caf
>>Sent>> سلام >>Received>> 3D’E
>>Sent>> Привет >>Received>> @825B
>>Sent>> $ >>Received>> 2
>>Sent>> @ >>Received>> 2
outbound
>>Sent>> Café >>Received>> Caf
>>Sent>> سلام >>Received nothing>> 
>>Sent>> Привет >>Received nothing>>
>>Sent>> $ >>Received>> ¤
>>Sent>> $ >>Received>> ¡
I
>>Sent>> montréal >>Received>> montr al
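
One possible explanation for the ¤ and ¡ characters in the outbound results, offered as an assumption rather than a confirmed diagnosis: ASCII bytes may be reaching a receiver that reads them as GSM 03.38 default-alphabet codes. In that alphabet 0x24 is ¤ and 0x40 is ¡, while `$` and `@` live at 0x02 and 0x00. The partial table and the helper name below are illustrative only.

```python
# Partial GSM 03.38 default alphabet: only the code points relevant here.
GSM_0338_PARTIAL = {
    0x00: "@",
    0x02: "$",
    0x24: "\u00a4",  # currency sign
    0x40: "\u00a1",  # inverted exclamation mark
}

def ascii_read_as_gsm(text: str) -> str:
    """Show what appears if ASCII byte values are interpreted as GSM 03.38 codes."""
    out = []
    for ch in text:
        # Code points missing from the partial table are passed through unchanged.
        out.append(GSM_0338_PARTIAL.get(ord(ch), ch))
    return "".join(out)

print(ascii_read_as_gsm("$"))  # ASCII 0x24 read as GSM 0x24
print(ascii_read_as_gsm("@"))  # ASCII 0x40 read as GSM 0x40
```

If this is what happens on the link, the text would need to be mapped to GSM 03.38 codes before it is sent with the default data coding.
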
vladpaiu commented 5 years ago

Hello,

Will take a shot at this - just wanted to confirm some implementation details. For the type of SMPP output, will simply rely on the SIP Content Type header. If missing or specifying no charset, will use the standard SMSC charset. Otherwise, if utf-16 charset is present in the Content Type header, the UCS2 format will be used for sending to the SMSC.
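
The selection rule described above can be sketched as follows. This is an illustration in Python, not the proto_smpp code; the function name is hypothetical. The constants are the SMPP data_coding values for the SMSC default alphabet (0x00) and UCS2 (0x08).

```python
DATA_CODING_DEFAULT = 0x00  # SMSC default alphabet
DATA_CODING_UCS2 = 0x08     # UCS2 (ISO/IEC-10646)

def pick_data_coding(content_type):
    """Hypothetical helper: choose an SMPP data_coding from a SIP Content-Type value."""
    if content_type:
        # Skip the media type itself and look only at the ";"-separated parameters.
        for param in content_type.split(";")[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "charset":
                if value.strip().strip('"').lower() == "utf-16":
                    return DATA_CODING_UCS2
    # Missing header, missing charset, or any other charset: use the default.
    return DATA_CODING_DEFAULT

print(pick_data_coding("text/plain; charset=UTF-16"))  # 8
print(pick_data_coding("text/plain"))                  # 0
```
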

For the encoding in the actual body of the SIP message, I will try to use direct binary representation, but will also have to implement standard %x escaping, since it would be quite useful in various situations, i.e. when the SMS messages need to be generated by OpenSIPS (e.g. via the t_uac_dlg MI command).

Regards, Vlad

volga629 commented 5 years ago

Hello Vlad, that is great news, thank you. So when we send a MESSAGE, before it is converted to SMS over SMPP, will we need to inject a Content-Type header with the encoding? Can you give an example?

vladpaiu commented 5 years ago

Hello,

Just committed https://github.com/OpenSIPS/opensips/commit/d724cec9c25a7ad19cdd805d7d92ce63c12b2b87 for support of UCS2.

When receiving a SIP message which needs to be converted to UCS2, it must contain the following indicator:

Content-Type:text/plain; charset=UTF-16

The body will need to contain the HEX representation of the UTF-16 text that needs to be sent over SMPP. For example, garçon needs to come as 00670061007200e7006f006e
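
A body in this form can be produced outside OpenSIPS. The sketch below is one way to do it in Python; the function name is hypothetical. It assumes big-endian UTF-16 with no byte-order mark, which is what the garçon example implies.

```python
def to_utf16_hex(text: str) -> str:
    """Encode text as big-endian UTF-16 without a BOM, then hex-encode the bytes."""
    return text.encode("utf-16-be").hex()

print(to_utf16_hex("garçon"))  # 00670061007200e7006f006e
```

Characters outside the Basic Multilingual Plane come out as surrogate pairs, which strict UCS2 does not define, so such text may not survive the trip.
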

Tested the SIP to SMPP part of this and it seems to work fine.

For SMPP to SIP, I can't do any testing, but the behaviour should be pretty similar: if UCS2 coding is used in the SMPP message, then the SIP message will come out with Content-Type:text/plain; charset=UTF-16 and carry a hex-encoded version of the UTF-16 text as its body.
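
On the receiving side, a SIP endpoint or helper script would have to reverse that encoding. A minimal sketch, assuming the body is big-endian UTF-16 hex as described above (the function name is hypothetical):

```python
def from_utf16_hex(body: str) -> str:
    """Turn a hex-encoded big-endian UTF-16 body back into readable text."""
    return bytes.fromhex(body.strip()).decode("utf-16-be")

print(from_utf16_hex("00670061007200e7006f006e"))  # garçon
```
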

If you could test and give some feedback, that would be perfect.

Best Regards, Vlad

volga629 commented 5 years ago

I am going to test everything. Can I get a diff so that I can apply it to the 3.0 release?

vladpaiu commented 5 years ago

Hello,

Sure, you can get that from github from any commit, simply by adding .patch at the end of the URL : https://github.com/OpenSIPS/opensips/commit/d724cec9c25a7ad19cdd805d7d92ce63c12b2b87.patch

Note that the commit also adds another param to the send_smpp_message script function, which lets you request delivery receipts for the sent messages.

Best Regards, Vlad

volga629 commented 5 years ago

I am testing inbound SMS from the provider in multiple languages, and it looks like there is a message decoding issue. When a message comes from SMPP to SIP, the MESSAGE body always stays encoded. There is also a missing space, shown below.

2019/09/11 11:53:42.579934 207.35.127.183:5060 -> 207.35.127.183:5060
MESSAGE sip:14165486501@207.35.127.183:34827 SIP/2.0
Via: SIP/2.0/UDP 207.35.127.183:5060;branch=z9hG4bKef94.1392e0b4.0
To: sip:14165486501@207.35.127.183:34827
From: <sip:14168586001@205.205.22.26:2775>;tag=5ad3f80f16abe80e2341d7496787ef55-dba5
CSeq: 10 MESSAGE
Call-ID: 7515f7062ac3daed-28116@207.35.127.183
Max-Forwards: 70
Content-Length: 12
User-Agent: OpenSIPS (3.0.0 (x86_64/linux))
Content-Type:text/plain; charset=UTF-16    <--- **Missing Space after Content-Type **
X-Node-ID: 46903HtW21

041f04400438
vladpaiu commented 5 years ago

Hello,

Can you mail me the incoming SMPP & the outgoing SIP to vladpaiu@opensips.org ?

Best Regards, Vlad

volga629 commented 5 years ago

I sent you pcap

johandeclercqdemocon commented 5 years ago

Our provider sends and receives in GSM 7-bit Default Alphabet . How can I handle that ?

volga629 commented 5 years ago

Hello Vlad, sorry for the delay, I was pulled away to other tasks. The good news is that I built a dev environment with a few test DIDs for SMS and an OpenSIPS node. I am going to do full testing and report back.

volga629 commented 5 years ago

I finally got the trunks working. Here is an example of sending "Hello word" downstream to the SMS provider:

Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: SMS_ROUTE: Got  ext number looking for correct sms gateway
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: Long distance SMS destination number is ~> []
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: ERROR:proto_smpp:hex2int: 'H' is no hex char
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: ERROR:proto_smpp:hex2int: 'l' is no hex char
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: ERROR:proto_smpp:hex2int: 'l' is no hex char
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: ERROR:proto_smpp:hex2int: 'o' is no hex char
Nov 11 15:33:37 dev1-fr /usr/sbin/opensips[7950]: ERROR:proto_smpp:hex2int: ' ' is no hex char
vladpaiu commented 4 years ago

Hey,

For incoming messages, the format needs to be the same as what OpenSIPS sends out, as I mentioned in earlier messages: "The body will need to contain the HEX representation of the UTF-16 text that needs to be sent over SMPP. For example, garçon needs to come as 00670061007200e7006f006e" (0067 being g, 0061 being a, etc.; see the full UCS encoding at http://www.columbia.edu/kermit/ucs2.html).
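
The hex2int errors in the log above are what one would expect when a plain-text body is sent with charset=UTF-16. A pre-check along these lines could catch that before the message reaches proto_smpp; it is only a sketch, and the function name is hypothetical.

```python
import string

def looks_like_utf16_hex(body: str) -> bool:
    """True if body is non-empty, all hex digits, and a whole number of 16-bit units."""
    body = body.strip()
    return (
        len(body) > 0
        and len(body) % 4 == 0  # four hex digits per UTF-16 code unit
        and all(ch in string.hexdigits for ch in body)
    )

print(looks_like_utf16_hex("Hello word"))                # False
print(looks_like_utf16_hex("00670061007200e7006f006e"))  # True
```
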

volga629 commented 4 years ago

So what should the MESSAGE look like before it is sent to the SMPP provider?

volga629 commented 4 years ago

Do we need to convert the body manually and set Content-Type:text/plain; charset=UTF-16?

johandeclercqdemocon commented 4 years ago

I believe so.


volga629 commented 4 years ago

Is it something like this?

$var(msg) = $(rb{s.encode.hexa})
vladpaiu commented 4 years ago

Hey,

s.encode.hexa won't really work, since it just assumes regular ASCII encoding, eg. letter 'v' will be encoded as 76 instead of UCS 0076. We should maybe add an s.encode.ucs transformation for this purpose.
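
The difference can be seen outside OpenSIPS as well. The sketch below contrasts byte-wise hex with 16-bit UCS2 hex in Python. It assumes the SIP body is UTF-8 and that byte-wise hex is a fair model of what s.encode.hexa produces; both function names are hypothetical.

```python
def bytewise_hex(text: str) -> str:
    """Hex of the raw bytes: two hex digits per byte."""
    return text.encode("utf-8").hex()

def ucs2_hex(text: str) -> str:
    """Hex of big-endian 16-bit code units: four hex digits per character."""
    return text.encode("utf-16-be").hex()

print(bytewise_hex("v"), ucs2_hex("v"))  # 76 0076
print(bytewise_hex("ç"), ucs2_hex("ç"))  # c3a7 00e7
```

For non-ASCII text the two forms differ in more than zero padding, so byte-wise hex cannot simply be padded into the UCS2 form.
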

volga629 commented 4 years ago

Hello Vlad, yes, it would be nice to add this transformation; otherwise we will need to introduce an external script.


volga629 commented 4 years ago

Can we use something like iconv as a workaround? Do you want me to open a separate ticket for the new transformation?

vladpaiu commented 4 years ago

Hello,

Test first with an external script so that we can first close this issue and then we can talk about implementing the transformation.

volga629 commented 4 years ago

We built a conversion script that mixes Lua and Python, but Lua in OpenSIPS seems to have an issue: it sets userdata instead of a string, and it does not provide metadata for the userdata.

Opensips script

xlog("SMS encode UTF16 with body $rb\n");
if (lua_exec("arg_function", "encode:UTF16:$rb")) {
    xlog("Encoded body to UTF-16 ~> [$avp(msg-dst)]\n");
}

Opensips Lua output

Dec  5 20:13:32 dev1-fr /usr/sbin/opensips[2631]: siplua: python string : userdata: 0x7f5e54b3f0a0
Dec  5 20:13:32 dev1-fr /usr/sbin/opensips[2631]: siplua: python string : userda
Dec  5 20:13:32 dev1-fr /usr/sbin/opensips[2631]: siplua: python string : a: 0x7f5e54b3f0a0
Dec  5 20:13:32 dev1-fr /usr/sbin/opensips[2631]: siplua: python string :
tostring(userdata)

Creator of that userdata must provide __tostring metamethod.
volga629 commented 4 years ago

@vladpaiu We are blocked right now because Lua is not working as expected in OpenSIPS. I asked on the mailing list, but have had no reply so far. Can we introduce the conversion as a transformation, as we originally discussed? It would save everybody a lot of time and resources.

volga629 commented 4 years ago

It would be nice to have, as a start:

s.encode/decode.ucs/utf16/gsm7

volga629 commented 4 years ago

This is what the provider supports:

An SMS message must fit in 140 bytes, as this is the total payload available to send a single message. For example, the GSM 7-bit alphabet can pack 160 characters into this (160 x 7 bits = 1,120 bits / 8 = 140 bytes). UTF-8 allows a maximum of 140 characters (fewer when multi-byte characters are used), while a UTF-16/UCS-2 encoded message allows only 70.
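
The arithmetic can be checked with a short Python sketch. The septet packer is a generic illustration of 7-bit packing with hypothetical names, not code taken from OpenSIPS or from the provider.

```python
PAYLOAD_BYTES = 140  # payload of a single SMS

def max_chars(bits_per_char: int) -> int:
    """How many fixed-width characters fit in one SMS payload."""
    return (PAYLOAD_BYTES * 8) // bits_per_char

def pack_septets(septets):
    """Pack 7-bit values into octets, least-significant bits first."""
    acc = 0
    nbits = 0
    out = bytearray()
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # flush the remaining bits, zero-padded
    return bytes(out)

print(max_chars(7), max_chars(8), max_chars(16))  # 160 140 70
print(len(pack_septets([0x41] * 160)))            # 140
```
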

In the MESSAGE we supply:

add_body_part("$avp(formatted-msg)", "text/plain; charset=UTF-16");

encoded message

Call-ID: 5hx6vvbb5JW-qO-cAeo_zA..
CSeq: 2 MESSAGE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY, SUBSCRIBE, UPDATE, INFO, MESSAGE
Content-Type: text/plain; charset=UTF-16
User-Agent: C-Tel-NL
Content-Length: 17

//5oAGUAbABsAG8A

Is this correct? We ask because in the dump of the SMPP link we see:

Frame 19: 132 bytes on wire (1056 bits), 132 bytes captured (1056 bits)

Data coding: 0x08 SMPP Data Coding Scheme: UCS2 (ISO/IEC-10646) (0x08)
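
The body shown above does not look like the hex form described earlier in the thread. It decodes as Base64 into little-endian UTF-16 text with a byte-order mark. The sketch below, with a hypothetical helper name, decodes it and re-encodes it as big-endian UTF-16 hex; whether proto_smpp accepts anything other than that hex form is not confirmed here.

```python
import base64

def base64_utf16_to_hex(body: str) -> str:
    """Decode a Base64 UTF-16 body (with BOM) and re-encode it as big-endian hex."""
    raw = base64.b64decode(body)
    # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
    text = raw.decode("utf-16")
    return text.encode("utf-16-be").hex()

print(base64.b64decode("//5oAGUAbABsAG8A").decode("utf-16"))  # hello
print(base64_utf16_to_hex("//5oAGUAbABsAG8A"))                # 00680065006c006c006f
```
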

volga629 commented 4 years ago

OK, the Lua issue has been resolved, but we still need verification of the encoding/decoding.

volga629 commented 4 years ago

OK, we figured out all the encoding and decoding. I committed all the details here:

https://github.com/VoIP-SAAS/opensips-smpp-lua

Please help test it; any comments are welcome.

volga629-1 commented 4 years ago

@vladpaiu I got all the routing working, and single-line SMS encodes and decodes correctly. However, when I tried to implement msilo to store inbound SMS offline, I found that I can't encode or decode multi-line SMS. How should I handle this?

volga629-1 commented 4 years ago

Something like this msilo offline message:

 $avp(msg) = "SMS Mailer:\n" + "Sub ID ~> " + $fU + ".\n" + "Delivery Time ~> " + $time(%T) + ".\n";
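
For multi-line text, one option is to hex-encode the whole string, so that each line feed becomes the code unit 000a and the SIP body stays on a single line. This is a sketch under the assumption that the hex format from earlier in the thread applies unchanged. The sample text and function names are made up, and none of this has been verified against msilo or proto_smpp.

```python
def multiline_to_utf16_hex(text: str) -> str:
    """Hex-encode text as big-endian UTF-16; line feeds become 000a."""
    return text.encode("utf-16-be").hex()

def utf16_hex_to_text(body: str) -> str:
    """Reverse of multiline_to_utf16_hex."""
    return bytes.fromhex(body).decode("utf-16-be")

sample = "SMS Mailer:\nSub ID ~> 1001.\n"  # made-up sample text
encoded = multiline_to_utf16_hex(sample)

print("\n" in encoded)                       # False
print(utf16_hex_to_text(encoded) == sample)  # True
```

Longer texts still have to respect the 70-character limit per UCS2 message.
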
volga629-1 commented 4 years ago

Hello @vladpaiu, regarding your earlier request to test first with an external script: the external script is working. We have encoding and decoding. Script repository:

https://github.com/VoIP-SAAS/opensips-smpp-lua

What is next? It would be nice to add a transformation for encoding and decoding.