pyfpdf html to pdf mis-handles '<', '>', '&' and ' in tables.

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. The HTML-to-PDF routine inserts a new cell when it sees an lt, gt, amp, or
apos entity. This produces incorrect PDF output: an extra column is added to
the table each time one of these characters is encountered, and the character
itself is dropped. For example, the following table in HTML

row 1   It's a boy  Wt > 4 kg   len < 40cm
row 2   boy & girl  Wt > 3kg    len < 35 cm
is converted to the following PDF (only the first four columns are shown):

row 1   It  s a boy Wt
row 2   boy girl    Wt
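The splitting described above can be reproduced without pyfpdf. The sketch below is a hypothetical stand-in, not the library's code: it assumes the converter emits one table cell per `handle_data` callback, and it uses the standard-library `HTMLParser` with `convert_charrefs=False` so that entities are reported separately from the surrounding text. The class name `NaiveCellParser` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class NaiveCellParser(HTMLParser):
    """Hypothetical stand-in for the converter: one cell per handle_data call."""

    def __init__(self):
        # convert_charrefs=False makes the parser report entities through
        # handle_entityref instead of merging them into the text.
        super().__init__(convert_charrefs=False)
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Each text fragment becomes its own cell. The entity between two
        # fragments never reaches this method, so it is silently dropped.
        if self.in_cell and data.strip():
            self.rows[-1].append(data.strip())


SAMPLE = (
    "<table>"
    "<tr><td>row 1</td><td>It&apos;s a boy</td>"
    "<td>Wt &gt; 4 kg</td><td>len &lt; 40cm</td></tr>"
    "<tr><td>row 2</td><td>boy &amp; girl</td>"
    "<td>Wt &gt; 3kg</td><td>len &lt; 35 cm</td></tr>"
    "</table>"
)

parser = NaiveCellParser()
parser.feed(SAMPLE)
parser.close()
for row in parser.rows:
    # Only four columns fit in the table, so show the first four cells.
    print(row[:4])
# ['row 1', 'It', 's a boy', 'Wt']
# ['row 2', 'boy', 'girl', 'Wt']
```

Under that assumption, the first four cells of each row match the PDF output reported in this issue.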

What is the expected output? What do you see instead?
The HTML produces:
--------------------------------------------------------------------
| row 1       | It's a boy     |  Wt > 4 kg     | len < 40cm       |
|-------------|----------------|----------------|------------------|
| row 2       | boy & girl     |  Wt > 3kg      | len < 35 cm      |
--------------------------------------------------------------------
The PDF produces:
--------------------------------------------------------------------
| row 1       | It             | s a boy        | Wt               |
|-------------|----------------|----------------|------------------|
| row 2       | boy            | girl           | Wt               |
--------------------------------------------------------------------
What version of the product are you using? On what operating system?
2.9 on linux, windows, OSX

Please provide any additional information below.

The essential code is at https://github.com/rjwarg/pyfpdf_hack.git. The README 
has a brief summary.

Original issue reported on code.google.com by rjw...@gmail.com on 17 Nov 2014 at 8:12

GoogleCodeExporter commented 9 years ago

Possibly a duplicate of Issue 51?

Original comment by vadm...@gmail.com on 18 Mar 2015 at 5:10

GoogleCodeExporter commented 9 years ago

Dude-

Yes it is, except that it proposes a change to html.py which abstracts the
problem to a lower level, handles a more complete set of characters and
eliminates the need to insert 'mystery code' into the application itself to
work around the otherwise undocumented (and undesirable) behavior.
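To illustrate what handling this at the parser level could look like, here is a minimal sketch. It is not the code from the linked repository or from pyfpdf's html.py; it assumes a parser built on the standard-library `HTMLParser`, and the class name `BufferedCellParser` is hypothetical. The idea is to buffer all text and decoded entities belonging to a `<td>` and emit a single cell only when the closing tag arrives.

```python
from html import unescape
from html.parser import HTMLParser


class BufferedCellParser(HTMLParser):
    """Hypothetical sketch: collect all text of a <td> before emitting a cell."""

    def __init__(self):
        # Entities are reported separately so they can be decoded explicitly.
        super().__init__(convert_charrefs=False)
        self.rows = []
        self.cell_parts = None  # None means "not inside a <td>"

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self.cell_parts = []

    def handle_data(self, data):
        if self.cell_parts is not None:
            self.cell_parts.append(data)

    def handle_entityref(self, name):
        # Named entity such as &lt; or &apos;: decode it and keep it in the cell.
        if self.cell_parts is not None:
            self.cell_parts.append(unescape("&%s;" % name))

    def handle_charref(self, name):
        # Numeric entity such as &#62; or &#x27;.
        if self.cell_parts is not None:
            self.cell_parts.append(unescape("&#%s;" % name))

    def handle_endtag(self, tag):
        # Emit exactly one cell per <td>, however many fragments it contained.
        if tag == "td" and self.cell_parts is not None:
            self.rows[-1].append("".join(self.cell_parts).strip())
            self.cell_parts = None


fixed = BufferedCellParser()
fixed.feed(
    "<table><tr><td>boy &amp; girl</td><td>It&apos;s &#62; 3kg</td></tr></table>"
)
fixed.close()
print(fixed.rows)
# [['boy & girl', "It's > 3kg"]]
```

With this approach the application no longer needs to pre-process cell text before handing it to the converter. On Python 3.5 and later, `HTMLParser` defaults to `convert_charrefs=True`, which decodes entities into the text passed to `handle_data`, so the explicit entity callbacks matter mainly for older parsers or when that option is turned off.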

I have replaced the stock html.py with this modified version and forgot the
problem existed until I upgraded to a later version of web2py.

The essential code is at https://github.com/rjwarg/pyfpdf_hack.git.
The README has a brief summary.

Richard

Original comment by rjw...@gmail.com on 18 Mar 2015 at 2:09

ReptarX / pyfpdf

pyfpdf html to pdf mis-handles '<', '>', '&' and ' in tables. #80
