manoj535 closed 3 years ago
Many thanks for the suggestion, I will get this into the code in the next few days. Would you be able to construct a sample RTF file which contains the relevant characters so I can add it as a test case?
Thanks for the response. Please find below the RTF string and the corresponding decoded string.
1) rtf = "{\rtf1\ansi\ansicpg932\deff0\deflang1033\deflangfe1041{\fonttbl{\f0\fnil\fcharset0 MS Sans Serif;}{\f1\froman\fprq1\fcharset128 MS UI Gothic;}} {\colortbl ;\red255\green0\blue0;\red0\green0\blue255;} \viewkind4\uc1\pard\cf1\lang1041\f0\fs17 BLC U=>L Splice \f1\fs18\'82\'c5U/W No.2 Dancer\'82\'a9\'82\'e7\'83\'56\'83\'8f\'94\'ad\'90\'b6\'81\'42Set\'8e\'9e\'82\'c9\'95\'5c\'91\'7710\'87\'6f\'82\'d9\'82\'c7\'83\'80\'81\'5b\'83\'6a\'83\'93\'83\'4f\'81\'40pallet\'92\'ea\'82\'ccRoll\cf2\f0\fs17 \par }"
decoded string = " BLC U=>L Splice でU/W No.2 Dancerからシワ発生。Set時に表層10㎜ほどムーニング pallet底のRoll"
㎜ (encoded as \'87\'6f) is the NEC special character here.
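For anyone reproducing this by hand, the `\'hh` escapes in the sample can be turned back into bytes and decoded with a chosen charset. The snippet below is only a rough sketch of that idea, not the parser's actual code: the class and method names are hypothetical, and it handles nothing but `\'hh` escapes and plain ASCII characters (no control words or groups).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of rtfparser.
class RtfHexEscapeSketch {

    // Converts \'hh escapes to raw bytes, copies other (ASCII) characters
    // through unchanged, then decodes the bytes with the named charset.
    static String decodeFragment(String fragment, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < fragment.length()) {
            if (fragment.startsWith("\\'", i) && i + 4 <= fragment.length()) {
                // Two hex digits follow the \' marker.
                bytes.write(Integer.parseInt(fragment.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                bytes.write(fragment.charAt(i));
                i++;
            }
        }
        return new String(bytes.toByteArray(), Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Fragment taken from the sample RTF above.
        String fragment = "\\'95\\'5c\\'91\\'7710\\'87\\'6f";
        System.out.println(decodeFragment(fragment, "MS932"));
    }
}
```

Note that the trail bytes 0x5c and 0x77 are written as escapes even though they fall in the ASCII range, so the bytes have to be collected first and decoded as one sequence.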
Thanks for that. I didn't merge your branch directly in the end, but the change and the related test case are now in place. I've credited you in the release notes on GitHub, hope that's OK!
That's fine. Thanks for merging the changes.
With the current mapping, LOCALEID_MAPPING.put("932", "SJIS");, NEC special characters such as ㎜, ①, and ② are not decoded properly. See https://en.wikipedia.org/wiki/JIS_X_0208#0x2D for the NEC special characters.
MS932 covers the SJIS characters and also supports the NEC special characters, so I changed the mapping for 932 to MS932: LOCALEID_MAPPING.put("932", "MS932");
This change fixes the decoding of NEC special characters in rtfparser.
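The difference between the two charset names can be checked in isolation. The following is a minimal sketch, assuming a JDK that ships the extended charsets (so that both "SJIS" and "MS932" resolve via Charset.forName); the class and method names are made up for this example and do not come from rtfparser.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; only illustrates the charset difference.
class Cp932MappingSketch {

    // Decodes raw bytes with the charset looked up by name.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0x87 0x6f is the byte pair behind \'87\'6f in the sample RTF.
        byte[] squareMm = {(byte) 0x87, (byte) 0x6f};
        // 0x87 0x40 is the first cell of the NEC special character row.
        byte[] circledOne = {(byte) 0x87, (byte) 0x40};

        // With MS932 these decode to U+339C and U+2460.
        System.out.println("MS932: " + decode(squareMm, "MS932")
                + " " + decode(circledOne, "MS932"));

        // With SJIS the same bytes have no mapping, which is the
        // behaviour reported above; the exact replacement output
        // is not asserted here.
        System.out.println("SJIS:  " + decode(squareMm, "SJIS")
                + " " + decode(circledOne, "SJIS"));
    }
}
```

A check along these lines could also back a unit test for the 932 mapping, alongside the sample RTF file from the discussion.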