usnistgov / jsip

JSIP: Java SIP specification Reference Implementation (moved from java.net)

BUG in JAIN-SIP-DSP-2389 #2

Open francescoleone opened 8 years ago

francescoleone commented 8 years ago

Dear Sirs, we are using the library version named in the subject, and we are seeing a fault when the library parses the body of a SIP MESSAGE request (SMS). When the body of the SIP message contains bytes equal to or greater than 0x80, the library replaces each of them with 0x3F.

Suppose the SMS body passed to the library is:

00 02 00 07 91 14 70 80 00 65 78 12 11 02 0c 91 24 60 80 32 84 86 00 00 ff 04 d4 f2 9c 0e

The output from the library is:

00 02 00 07 3f 14 70 3f 00 65 78 12 11 02 0c 3f 24 60 3f 32 3f 3f 00 00 3f 04 3f 3f 3f 0e

That is, 91, 80, 84, 86, ff, d4, f2 and 9c all become 3F.

I searched the library's known issues but could not find any reference to this.

vladimirralev commented 8 years ago

This doesn't look like valid UTF-8, so some symbols fail to map. You must be using some other encoding; please check your system default encoding and charset. I suggest forcing UTF-8, as it is the most portable. Please also include a few examples of code and messages, so we can see whether something in JSIP is at fault here or the problem comes from elsewhere.
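One way to check the default encoding mentioned above is to print it from inside the same JVM that runs the SIP application. The sketch below uses only standard JDK calls; the class name DefaultCharsetCheck is made up for illustration and is not part of JSIP.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of JSIP.
class DefaultCharsetCheck {

    // Reports the file.encoding system property and the JVM default charset.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same JVM options as the application (for example the same -Dfile.encoding setting, if any) shows which charset String conversions fall back to when none is given explicitly.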

francescoleone commented 8 years ago

Hi guys,

Sorry for the late reply; we were discussing this internally.

What we want to point out is that this SIP MESSAGE (SMS) body is a real message coming from a well-known customer network; it is not a hypothetical value.

When the body of the SIP message contains bytes equal to or greater than 0x80, the library replaces each of them with 0x3F.

The SMS body passed to the library (the real SIP message coming from the well-known telco customer) is:

00 02 00 07 91 14 70 80 00 65 78 12 11 02 0c 91 24 60 80 32 84 86 00 00 ff 04 d4 f2 9c 0e

The output from the library is:

00 02 00 07 3f 14 70 3f 00 65 78 12 11 02 0c 3f 24 60 3f 32 3f 3f 00 00 3f 04 3f 3f 3f 0e

That is, 91, 80, 84, 86, ff, d4, f2 and 9c all become 3F.

As you can see from the picture, it is a normally coded message that is easily decoded by an online tool.

Could you please investigate this in the library?


vladimirralev commented 8 years ago

The images or attachments in your email were filtered out by GitHub, so I still don't understand the problem. I see now that an external system sends you the messages, which is a small hint. I think you may be using message.getContent() or some internal method that attempts to decode the message using a best-effort encoding (do you have a charset attribute in your message?), and you might get a bad result. If that is the case, make sure to use message.getRawContent(), which should give you the original bytes. Even then, it is still unclear which encoding was used; you will need to find that out somehow and decode the bytes accordingly. If this doesn't answer your question, please post sample code and messages that I can test with.
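To illustrate why going through a String can damage a binary body while the raw bytes stay intact, here is a minimal JDK-only sketch. It does not call JSIP. The class RoundTripDemo and its methods are hypothetical, and US-ASCII is an assumption picked only because it cannot represent bytes of 0x80 and above; which charset is actually applied in the reporter's setup is not established in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JSIP.
class RoundTripDemo {

    // The body bytes quoted in this issue.
    static final String INPUT =
            "00 02 00 07 91 14 70 80 00 65 78 12 11 02 0c 91 24 60 80 32 84 86 00 00 ff 04 d4 f2 9c 0e";

    // Parses a space-separated hex string into bytes.
    static byte[] fromHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Formats bytes as lowercase, space-separated hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Lossy path: bytes -> String -> bytes.
    // ASSUMPTION: US-ASCII stands in for whatever charset is really used.
    // Bytes >= 0x80 are not valid US-ASCII, so decoding substitutes a
    // replacement character and encoding writes '?' (0x3f) in its place.
    static byte[] viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.US_ASCII);
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = fromHex(INPUT);
        System.out.println("raw bytes : " + toHex(raw));
        System.out.println("via String: " + toHex(viaString(raw)));
    }
}
```

Under that assumption, the second line printed has 3f wherever the input had a byte of 0x80 or above, which is the same pattern as the output quoted in the report. Keeping the body as a byte[] from end to end, as with message.getRawContent(), avoids the conversion entirely.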

francescoleone commented 8 years ago

The image that was filtered out was meant to show that the sequence we received:

00 02 00 07 91 14 70 80 00 65 78 12 11 02 0c 91 24 60 80 32 84 86 00 00 ff 04 d4 f2 9c 0e

is easily decoded by an online SMS decoding tool.

[image: smsmessage_decoded]

vladimirralev commented 8 years ago

Well, apparently it's encoded as UTF-16LE, and the tool either auto-detects that or assumes it. In any case, I just ran a quick test and everything worked fine for me with message.getRawContent(); I see no problem. It would save a lot of time if you posted your code and SIP messages, along with the relevant context, so I can check whether the advertised encoding is correct. I can suggest running your Java program with -Dfile.encoding=UTF-16LE, which should be the right mode for you here, but it is still hard to understand what the problem is from this description.