reingart / pyfpdf_googlecode

Automatically exported from code.google.com/p/pyfpdf
GNU Lesser General Public License v3.0

Ampersands in table cells cause strange spacing #51

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
Make a table cell with an ampersand in the text, e.g. <td>yes & no</td>

What is the expected output? What do you see instead?
Should get a table cell with "yes & no":

STARTTAG td {}
box_shadow 77.0487623762 5.50333333333 None
td cell 35.6841707921 77.0487623762 yes & no L  *
ENDTAG td

But instead each word gets its own table cell:

STARTTAG td {}
box_shadow 77.0487623762 5.50333333333 None
td cell 35.6841707921 77.0487623762 yes L  *
box_shadow 77.0487623762 5.50333333333 None
td cell 112.732933168 77.0487623762 & *
box_shadow 77.0487623762 5.50333333333 None
td cell 189.781695545 77.0487623762  no *
ENDTAG td
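
The split can be observed outside pyfpdf with the standard-library parser alone. The following is a minimal sketch, not pyfpdf code: it assumes Python 3's html.parser with convert_charrefs=False to approximate the older HTMLParser behaviour, and ChunkLogger is a hypothetical helper name.

```python
from html.parser import HTMLParser


class ChunkLogger(HTMLParser):
    """Hypothetical helper: records each handle_data call made inside a <td>."""

    def __init__(self):
        # Assumption: convert_charrefs=False mimics the older parser, which
        # reports a bare "&" separately from the text around it.
        super().__init__(convert_charrefs=False)
        self.in_cell = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Every call here corresponds to one "td cell" line in the log above.
        if self.in_cell:
            self.chunks.append(data)


logger = ChunkLogger()
logger.feed("<table><tr><td>yes & no</td></tr></table>")
logger.close()
chunks = logger.chunks
print(chunks)
```

The cell text arrives in several handle_data calls rather than one, so any code that draws a cell per call will draw several cells.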

Original issue reported on code.google.com by marc.pfi...@gmail.com on 19 Feb 2013 at 9:24

GoogleCodeExporter commented 9 years ago
This is easily fixed by preprocessing the HTML before it is parsed.
After you open and read the HTML into a variable like this:

    with open(infile, 'r') as f:
        text = f.read()

replace every ampersand with a placeholder (you will need "import re"):

    text = re.sub('&', '@amp', text)

Then, in the handle_data method of the class HTML2FPDF(HTMLParser), add this
line at the very beginning of the method:

    txt = re.sub('@amp', '&', txt)

And the problem is solved. The problem is caused by the handle_entityref method
not being implemented.
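
One caveat of the placeholder approach is that it also replaces the ampersand in real entities, so &lt; would end up as the literal text "&lt;", and any "@amp" already present in the document becomes "&". An alternative is to collect the text for a cell and emit it once at the closing tag, with the entity handlers filled in. The following is only a sketch under assumptions: it uses Python 3's html.parser rather than pyfpdf's HTML2FPDF class, and CellCollector is a hypothetical name.

```python
import html
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Hypothetical sketch: buffer text per <td> and emit it once at </td>."""

    def __init__(self):
        # Assumption: convert_charrefs=False so the entity handlers are called,
        # as they would be in the older parser.
        super().__init__(convert_charrefs=False)
        self._buf = None   # list of text pieces while inside a <td>
        self.cells = []    # one finished string per table cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "td" and self._buf is not None:
            self.cells.append("".join(self._buf))
            self._buf = None

    def handle_data(self, data):
        # May be called several times per cell; just accumulate.
        if self._buf is not None:
            self._buf.append(data)

    def handle_entityref(self, name):
        # Named entity such as &amp; -> decode and keep in the same cell.
        if self._buf is not None:
            self._buf.append(html.unescape("&%s;" % name))

    def handle_charref(self, name):
        # Numeric entity such as &#60; -> decode and keep in the same cell.
        if self._buf is not None:
            self._buf.append(html.unescape("&#%s;" % name))


collector = CellCollector()
collector.feed(
    "<table><tr><td>yes & no</td><td>a &amp; b</td><td>1 &#60; 2</td></tr></table>"
)
collector.close()
cells = collector.cells
print(cells)
```

With this shape the number of handle_data calls no longer matters, because a cell is only produced when its closing tag is seen.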

Original comment by fredpett...@gmail.com on 16 Jul 2013 at 10:59