brendonh / pyth

Python text markup and conversion
MIT License

Handle #PCDATA in font table correctly #31

Closed. yairchu closed this 9 years ago

yairchu commented 9 years ago

According to the RTF spec, font names should be handled as opaque strings (#PCDATA). This adds an isPcData attribute to groups and sets it according to the spec for font_table.

This fixes the parsing of many common RTF files found in the wild that contain a substring such as "{\*\falt ?l?r ??\'81\'66c}".

Example RTF file: http://www.knesset.gov.il/privatelaw/data/19/3/687_3_1.rtf
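To illustrate why the flag matters, here is a minimal sketch and not the actual change in this PR. The function read_text, its is_pcdata parameter, and the cp1252 default are hypothetical stand-ins for the group attribute described above. It assumes the failure comes from decoding \'hh escapes with a codepage that has no mapping for the byte, as is the case for 0x81 in Python's cp1252 codec.

```python
import re

# Matches an RTF hex escape such as \'81 (backslash, apostrophe, two hex digits).
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def read_text(content, is_pcdata, codepage="cp1252"):
    """Return the text of a group (hypothetical helper, not pyth's API).

    is_pcdata=True:  keep the content as an opaque string, escapes untouched.
    is_pcdata=False: decode each \\'hh escape with the codepage; this raises
                     UnicodeDecodeError when the byte has no mapping.
    """
    if is_pcdata:
        return content

    def decode(match):
        return bytes([int(match.group(1), 16)]).decode(codepage)

    return HEX_ESCAPE.sub(decode, content)


# The alternate font name from the substring quoted above.
falt = r"?l?r ??\'81\'66c"

# Treated as #PCDATA, the name passes through unchanged.
opaque = read_text(falt, is_pcdata=True)

# Ordinary text with a decodable escape still works when the flag is off.
decoded = read_text(r"caf\'e9", is_pcdata=False)
```

With is_pcdata=False, the same falt string raises UnicodeDecodeError in this sketch, because cp1252 leaves 0x81 undefined. That is the kind of failure that treating font names as opaque strings avoids.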