Open twjjack opened 7 years ago
Support for Unicode characters was recently added. Please recheck whether the problem persists. Thanks.
I am attempting to parse mixed English/Cantonese documents, and the Cantonese is being garbled. I am also receiving a great deal of output such as:
...
WORD rtf (1)
WORD adeflang (1025)
WORD ansi (1)
WORD ansicpg (1252)
WORD uc (1)
WORD adeff (0)
WORD deff (0)
...
Unfortunately, Far Eastern languages (UTF-16 and UTF-32) are not yet implemented. You can help us by uploading your RTF file. Thanks.
Hi all,
Thanks! It works perfectly for English text, but it does not work when the RTF contains Chinese characters, which are stored as Unicode.
Here is my code:
$rtf = '{\rtf1\ansi\ansicpg1252\uc0\deff0{\fonttbl {\f0\fswiss\fcharset0\fprq2 Arial;} {\f1\fnil\fcharset0\fprq2 SimSun;} {\f2\froman\fcharset2\fprq2 Symbol;}} {\colortbl;\red0\green0\blue0;\red255\green255\blue255;} {\stylesheet{\s0\itap0\f0\fs24 [Normal];}{*\cs10\additive Default Paragraph Font;}} {*\generator TX_RTF32 11.0.401.501;} \deftab1134\paperw11907\paperh16443\margl567\margt567\margr567\margb567\pard\itap0\plain\f1\fs20\loch\f1\hich\f1\u20320\u22909\u21527\par }';
$result = $reader->Parse($rtf);
$formatter = new RtfHtml();
$test = $formatter->Format($reader->root);
It gives me this result: ◊u22909◊par
I am expecting to get \u20320\u22909\u21527, which I can then translate back into Chinese characters.
Has anyone here had a similar issue, and if so, what is the solution?
Cheers, Jack
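For anyone landing here with the same question: once the \uN escapes are available as text, turning them back into Chinese characters can be sketched as below. This is only an illustration, not the library's own method. The function name decodeRtfUnicodeEscapes is hypothetical, the mbstring extension is assumed to be installed, and the sketch assumes \uc0 as in the RTF above, so it does not skip any fallback characters after each escape.

```php
<?php
// Hypothetical helper, not part of the RTF library discussed in this thread.
// Converts runs of RTF \uN escapes into UTF-8 text.
// Assumes \uc0 (no fallback characters follow each \uN) and the mbstring extension.
function decodeRtfUnicodeEscapes(string $rtfText): string
{
    // Match one or more consecutive \uN escapes, each optionally followed by
    // the single space that RTF treats as a control-word delimiter.
    // Control words such as \uc0 do not match because a digit must follow "u".
    return preg_replace_callback(
        '/(?:\\\\u-?\d+ ?)+/',
        function (array $run): string {
            preg_match_all('/\\\\u(-?\d+)/', $run[0], $matches);
            $utf16 = '';
            foreach ($matches[1] as $number) {
                $unit = (int) $number;
                // RTF writes N as a signed 16-bit value, so values above
                // 32767 appear as negative numbers.
                if ($unit < 0) {
                    $unit += 65536;
                }
                // Pack as a big-endian UTF-16 code unit; decoding the whole
                // run at once keeps surrogate pairs together.
                $utf16 .= pack('n', $unit);
            }
            return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
        },
        $rtfText
    );
}

// Usage with the escapes from the post above.
echo decodeRtfUnicodeEscapes('\u20320\u22909\u21527\par'), "\n"; // prints 你好吗\par
```

The remaining control words such as \par are left untouched, so this step only makes sense on text where the \uN escapes have survived parsing.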