Closed by joka 14 years ago
Hi Joka. Sorry for the slow response, I'm away from home.
\f0 is a standard control word that the RTF reader normally handles. From your example, I'm not sure what problem it's having.
I don't see a way to attach files here, so could you email your whole RTF file to me at brendonh@gmail.com ? I'll figure out what's tripping it up.
Cheers, Brendon
OK, fine. And thank you for pyth, it's really nice to have a Pythonic RTF reader.
I think I've fixed this (in trunk). Pyth was ignoring font declarations that didn't have a \fcharset. Now they default to the reader's charset (e.g. from the initial \ansi) instead, which I think is the right thing to do -- the spec isn't clear.
It seems to work for your example doc, anyway.
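For anyone following along, here is a rough sketch of the fallback described above. It is not pyth's actual code: the function name `font_codecs`, the two lookup tables, and the regexes are all made up for illustration, and only a handful of `\fcharset` values are listed. The idea is simply that a font entry without `\fcharset` inherits the codec implied by the header keyword (e.g. `\ansi`) instead of being dropped.

```python
import re

# Assumed mapping from the header charset keyword to a Python codec name.
HEADER_CODECS = {"ansi": "cp1252", "mac": "mac_roman", "pc": "cp437", "pca": "cp850"}

# A few \fcharsetN values and the codecs they usually map to (deliberately incomplete).
FCHARSET_CODECS = {0: "cp1252", 128: "cp932", 161: "cp1253", 204: "cp1251", 238: "cp1250"}

# One font entry such as {\f0\froman Tms Rmn;}: captures the font number and the rest.
FONT_ENTRY = re.compile(r"\{\\f(\d+)([^{}]*)\}")
FCHARSET = re.compile(r"\\fcharset(\d+)")


def font_codecs(fonttbl, header_charset="ansi"):
    """Map each font number in a font table to a codec name.

    Entries without \\fcharset fall back to the header charset,
    rather than being ignored.
    """
    default = HEADER_CODECS[header_charset]
    codecs = {}
    for number, body in FONT_ENTRY.findall(fonttbl):
        match = FCHARSET.search(body)
        if match is None:
            # No \fcharset declared: inherit the reader's charset.
            codecs[int(number)] = default
        else:
            # Unknown \fcharset values also fall back to the default.
            codecs[int(number)] = FCHARSET_CODECS.get(int(match.group(1)), default)
    return codecs


table = r"{\fonttbl{\f0\froman Tms Rmn;}{\f1\fswiss\fcharset204 Helv;}}"
print(font_codecs(table))
```

With the sample table, font 0 has no `\fcharset` and picks up the `\ansi` default, while font 1 keeps its declared charset.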
I'm using RTF files generated by pandoc. They have a lot of "\f0" control words (no idea why).
/plugins/rtf15/reader.py cannot read these files because of this "\f0" word.
For a general solution, could you skip unknown control words?
Example RTF: {\rtf\ansi\deff0{\fonttbl{\f0\froman Tms Rmn;}{\f1\fdecor Symbol;}{\f2\fswiss Helv;}}{\colortbl;\red0\green0\blue0; \red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;}{\stylesheet{\fs20 \snext0Normal;}}{\info{\author John Doe} {\creatim\yr1990\mo7\dy30\hr10\min48}{\version1}{\edmins0} {\nofpages1}{\nofwords0}{\nofchars0}{\vern8351}}\widoctrl\ftnbj \sectd\linex0\endnhere \pard\plain \fs20 This is plain text.\
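To make the "skip unknown control words" request concrete, here is a minimal sketch of a lenient text extractor. It is an illustration only, not how pyth's rtf15 reader is written: `extract_text`, `TEXT_WORDS`, and `DESTINATIONS` are hypothetical names, the list of destinations is far from complete, and `\'hh` hex escapes and `\*` destinations are not handled. Control words it doesn't recognise are recorded and otherwise ignored instead of raising an error.

```python
import re

TOKEN = re.compile(
    r"\\([A-Za-z]+)(-?\d+)? ?"  # control word, optional numeric parameter, optional delimiter space
    r"|\\([^A-Za-z])"           # control symbol such as \{ or \\
    r"|([{}])"                  # group open / close
    r"|([^\\{}]+)"              # run of plain text
)

# Hypothetical handler table: the only control words this sketch understands.
TEXT_WORDS = {"par": "\n", "line": "\n", "tab": "\t"}

# Groups whose contents are not document text (incomplete list).
DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}


def extract_text(rtf):
    """Return (plain_text, skipped_words), ignoring unknown control words."""
    out, skipped = [], set()
    depth = 0
    skip_depth = None  # depth of the destination group currently being skipped
    for word, _param, symbol, brace, text in TOKEN.findall(rtf):
        if brace == "{":
            depth += 1
        elif brace == "}":
            if skip_depth is not None and depth == skip_depth:
                skip_depth = None  # the skipped destination group just closed
            depth -= 1
        elif skip_depth is not None:
            continue  # inside a destination such as the font table
        elif word in DESTINATIONS:
            skip_depth = depth
        elif word in TEXT_WORDS:
            out.append(TEXT_WORDS[word])
        elif word:
            skipped.add(word)  # unknown control word: note it and move on
        elif symbol in ("{", "}", "\\"):
            out.append(symbol)  # escaped literal character
        elif text:
            # Raw line breaks in RTF source are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out), skipped


sample = r"{\rtf\ansi\deff0{\fonttbl{\f0\froman Tms Rmn;}}\pard\plain\f0\fs20 This is plain text.\par}"
text, skipped = extract_text(sample)
print(repr(text))
print(sorted(skipped))
```

The trade-off is that silently skipping words can hide real formatting (fonts, sizes, colours), so a reader that needs those would still want explicit handlers for them and use the skip path only as a last resort.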