Parsing of HTML formatted texts

albrechtf / mcf2pdf

"My CEWE Photobook" MCF to PDF converter

This will be fixed in the next release: I improved the regular expression that extracts these paragraphs. An HTML parser would still be a much cleaner solution, but since the text can be plain HTML (not XHTML, not XML), that would require another external library...
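To illustrate the regex approach mentioned above, here is a minimal sketch of pulling paragraph text out of an HTML snippet with `java.util.regex`. The class name `ParagraphExtractor`, the method `extractParagraphs`, and both patterns are hypothetical and are not taken from the mcf2pdf source; the actual expression used in the project may look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParagraphExtractor {

    // Hypothetical pattern (an assumption, not the one used in mcf2pdf):
    // matches <p ...>...</p> case-insensitively; DOTALL lets the body span lines.
    // The \b keeps the pattern from matching other tags such as <pre>.
    private static final Pattern PARAGRAPH = Pattern.compile(
            "<p\\b[^>]*>(.*?)</p\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Removes any inline tags left inside a paragraph, e.g. <span> or <br>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns the plain text of every paragraph found in the given HTML. */
    public static List<String> extractParagraphs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = PARAGRAPH.matcher(html);
        while (m.find()) {
            String text = TAG.matcher(m.group(1)).replaceAll("");
            result.add(text.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><body><P style=\"margin:0\">First <span>line</span></P>\n"
                + "<p>Second\nline</p></body></html>";
        System.out.println(extractParagraphs(html));
    }
}
```

A pattern like this breaks on input that real HTML allows, such as a `<p>` without a closing tag or a `>` inside an attribute value, which is why a proper HTML parser would be the cleaner solution.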

As for the Chinese characters: yes, confirmed, they are not displayed. In your example, MCF marks them with the font "Arial", and standard Arial fonts do not include Chinese characters. If I copy them from the source of your MCF file into MS Word, I cannot select "Arial" as the font for them. This will be listed as a known issue for now.

Parsing of HTML formatted texts #6