jung-kurt / gofpdf

A PDF document generator with high level support for text, drawing and images
http://godoc.org/github.com/jung-kurt/gofpdf
MIT License

Problem when writing '<' with HTMLBasicNew #200

Open SC-JC opened 6 years ago

SC-JC commented 6 years ago

I have a web interface where users can write formatted text that gets saved as HTML snippets.

I use the library to write these snippets into a PDF file, which works pretty well as long as I replace some HTML tags with ones the library can interpret.

It gets tricky when a user writes a '<' character in the snippet, because then everything up to the next HTML tag gets removed, including the '<' character.

I think it's because the library tries to interpret it as one HTML tag, even though it isn't one.

If it's at the end of the snippet, where no new HTML tag follows, it gets printed without a problem.

Example: html.write("1 < 2<br />") results in the following text being printed into the PDF: 1

html.write("1 < 2") results in the following text being printed into the PDF: 1 < 2
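The behaviour described above is consistent with a tokenizer that treats anything from a '<' to the next '>' as a tag. The following is a minimal sketch of that idea, not the library's code: the pattern `<.*?>` and the helper `visibleText` are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagRe is an assumed stand-in for a naive tag pattern: everything from a '<'
// up to the nearest following '>'. It is not quoted from the library.
var tagRe = regexp.MustCompile(`<.*?>`)

// visibleText (hypothetical helper) drops everything the naive pattern
// considers a tag and returns what is left.
func visibleText(s string) string {
	return tagRe.ReplaceAllString(s, "")
}

func main() {
	// The stray '<' opens a "tag" that only ends at the '>' of <br />,
	// so "< 2<br />" is swallowed as a whole.
	fmt.Printf("%q\n", visibleText("1 < 2<br />")) // "1 "

	// Without any '>' there is nothing to match, so the text survives.
	fmt.Printf("%q\n", visibleText("1 < 2")) // "1 < 2"
}
```

Under this assumption, both reported outputs fall out directly: the first call loses "< 2" together with the tag, while the second is left untouched.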

jung-kurt commented 6 years ago

The comment in the source code says it all:

This is done with regular expressions, so the result is only marginally better than useless.

Probably the best solution would be to use a grown-up HTML parser but that introduces a dependency in an otherwise dependency-free package. For such limited support of HTML, maybe the best solution is to look for only the tags that the package actually cares about and treat the rest as literal. Let's keep this issue open until we do that or come up with another solution.
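One way the "only look for the tags the package cares about" idea could be sketched with the standard library alone is shown below. The tag whitelist (b, i, u, a, br, p, center), the `segment` type and the `tokenize` function are all hypothetical names chosen for this example; they are not the package's API, and the actual set of supported tags may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// segment is a hypothetical token: either literal text or a recognized tag.
type segment struct {
	Tag  bool
	Text string
}

// knownTagRe matches only an assumed whitelist of tags, opening or closing,
// with optional attributes and an optional self-closing slash.
var knownTagRe = regexp.MustCompile(`(?i)</?(?:b|i|u|a|br|p|center)(?:\s[^<>]*)?/?>`)

// tokenize splits s into recognized tags and literal text. Anything that is
// not a whitelisted tag, including a stray '<', stays in the literal text.
func tokenize(s string) []segment {
	var out []segment
	pos := 0
	for _, loc := range knownTagRe.FindAllStringIndex(s, -1) {
		if loc[0] > pos {
			out = append(out, segment{Text: s[pos:loc[0]]})
		}
		out = append(out, segment{Tag: true, Text: s[loc[0]:loc[1]]})
		pos = loc[1]
	}
	if pos < len(s) {
		out = append(out, segment{Text: s[pos:]})
	}
	return out
}

func main() {
	for _, seg := range tokenize("1 < 2<br />") {
		fmt.Printf("tag=%v text=%q\n", seg.Tag, seg.Text)
	}
}
```

With this approach the input from the report yields the literal "1 < 2" followed by the `<br />` tag. It remains a regular-expression heuristic: text such as "a <b and c> d" would still be read as a tag, so it narrows the problem rather than removing it.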