Character encoding issues around non-US English

remcoboerma / pyfpdf

Automatically exported from code.google.com/p/pyfpdf

GNU Lesser General Public License v3.0

Character encoding issues around non-US English #31

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

This is a great wrapper around the pdf libs, and I particularly like that it is 
standalone.

I am having issues using it for non-English words. I've included a small code 
snippet that attempts to encode the Spanish word año for year.

Instead of the expected output, I am seeing: aÃ±o

I am using version 1.54b on OSX 10.6.8 with Python 2.6.1

SAMPLE CODE TO REPRODUCE ISSUE

#!/usr/bin/python
# -*- coding: latin-1 -*-

#import os
from pyfpdf import FPDF

#font = 'ISO-8859-1'
font = 'courier'

pdf = FPDF()
pdf.add_page()

pdf.set_font(font, 'B', 16.0)
pdf.set_xy(95.0, 18.0)
pdf.cell(ln=0, h=10.0, align='C', w=75.0, txt='año', border=1)

pdf.set_font(font, '', 12.0)
pdf.set_xy(95.0, 28.0)
pdf.cell(ln=0, h=10.0, align='R', w=75.0, txt='year', border=0)

pdf.output('SpanishWord.pdf', 'F')

Original issue reported on code.google.com by jon.hei...@gmail.com on 29 Sep 2012 at 7:26

GoogleCodeExporter commented 9 years ago

You should check that your file is saved in UTF-8 (some editors don't handle 
encoding correctly)

aÃ±o does not look like latin-1: it is what the UTF-8 bytes of ñ look like 
when they are read as latin-1
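
For illustration, the symptom can be reproduced with the standard library alone. This is a minimal sketch, not pyfpdf code: it assumes the file was really saved as UTF-8 while being declared latin-1, and `looks_like_utf8` is a hypothetical helper name introduced here.

```python
# -*- coding: utf-8 -*-
# Sketch: reproduce the garbled output without pyfpdf.

# "año", written with an escape so the encoding of this file does not matter.
word = u'a\xf1o'

utf8_bytes = word.encode('utf-8')      # ñ becomes two bytes: 0xC3 0xB1
latin1_bytes = word.encode('latin-1')  # ñ becomes one byte: 0xF1

# Reading the UTF-8 bytes back as latin-1 turns those two bytes into two characters.
garbled = utf8_bytes.decode('latin-1')


def looks_like_utf8(raw):
    """Hypothetical helper: True if raw holds non-ASCII data that is valid UTF-8."""
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    # The lengths differ only when multi-byte sequences were present.
    return len(text) != len(raw)


print(garbled == u'a\xc3\xb1o')  # compares against "a", U+00C3, U+00B1, "o"
print(looks_like_utf8(utf8_bytes), looks_like_utf8(latin1_bytes))
```

Running `looks_like_utf8` on the raw bytes of the script (opened in binary mode) is one way to tell which encoding the editor actually wrote.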

Can you attach the file?

I would recommend using UTF-8 directly, changing the coding line to:

# -*- coding: utf-8 -*-

and then

pdf.cell(ln=0, h=10.0, align='C', w=75.0, txt=u'año', border=1)

(note the u in front of the string)

Latin-1 (ISO 8859-1) should not be a problem, but you can check specific 
unicode support here:

http://code.google.com/p/pyfpdf/wiki/Unicode

Original comment by reingart@gmail.com on 30 Sep 2012 at 4:12

GoogleCodeExporter commented 9 years ago

Thanks for the response. Here is the error I am seeing now:

~/pyfpdf/fpdf.py:393: UnicodeWarning: Unicode equal comparison failed to 
convert both arguments to Unicode - interpreting them as being unequal
  w += cw.get(s[i],0)
Traceback (most recent call last):
  File "issue.py", line 20, in <module>
    pdf.output('SpanishWord.pdf', 'F')
  File "~/pyfpdf/fpdf.py", line 871, in output
    self.close()
  File "~/pyfpdf/fpdf.py", line 286, in close
    self._enddoc()
  File "~/pyfpdf/fpdf.py", line 1188, in _enddoc
    self._putpages()
  File "~/pyfpdf/fpdf.py", line 964, in _putpages
    p = zlib.compress(self.pages[n])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 
85: ordinal not in range(128)
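
The last frame of the traceback shows `zlib.compress` receiving a unicode string, which Python 2 tries to convert with the ASCII codec and fails on ñ (u'\xf1'). The sketch below reproduces that failure explicitly and shows that encoding to latin-1 first avoids it. The `page` string is a made-up stand-in for `self.pages[n]`, and nothing here is meant to describe how later fpdf versions handle it.

```python
import zlib

# Hypothetical stand-in for the page buffer; the real contents are not shown in the thread.
page = u'BT (a\xf1o) Tj ET'

# Explicit version of the implicit conversion: the ASCII codec cannot represent U+00F1.
try:
    page.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to latin-1 gives zlib the byte string it needs.
compressed = zlib.compress(page.encode('latin-1'))
restored = zlib.decompress(compressed).decode('latin-1')

print(ascii_ok, restored == page)
```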

I am using vim which handles file encoding well, however I may not be handling 
the encoding correctly.

Please find the file attached.

Original comment by jon.hei...@gmail.com on 1 Oct 2012 at 4:59

Attachments:

issue.py

GoogleCodeExporter commented 9 years ago

Sorry, I cannot reproduce the bug.

I tested your code and it does work (attached is the sample pdf)

Are you using the latest version 1.7.1?

Original comment by reingart@gmail.com on 4 Oct 2012 at 4:45

Attachments:

SpanishWord.pdf

GoogleCodeExporter commented 9 years ago

I am using the 1.54b version that is in the pyfpdf zip. Running on a mac, so 
didn't pull down the win msi versions. Maybe the source version 1.7.1 will fix 
this.

Has the 1.54b zip not been updated, or is the pyfpdf package different from the 
fpdf package?

Original comment by jhei...@realtaentertainment.com on 4 Oct 2012 at 9:42

GoogleCodeExporter commented 9 years ago

Please use at least version 1.7

http://pyfpdf.googlecode.com/files/fpdf-1.7.zip

pyfpdf is the old package name; now you just have to import fpdf
(the previous name should still work, but that depends on how it is installed)

See unicode.py tutorial for a full example:

http://code.google.com/p/pyfpdf/source/browse/tutorial/unicode.py

(if you use non-latin1 chars, you will need to grab the ttf font pack)
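
A quick way to tell whether a given string stays within latin-1, and so should not need the TTF font pack, is to try encoding it. This is a minimal sketch using only the standard library; `fits_latin1` is a hypothetical helper name, not part of fpdf.

```python
def fits_latin1(text):
    """Hypothetical helper: True if every character of text exists in latin-1 (ISO 8859-1)."""
    try:
        text.encode('latin-1')
    except UnicodeEncodeError:
        return False
    return True


# Spanish "año": ñ is U+00F1, which is inside latin-1.
print(fits_latin1(u'a\xf1o'))

# Russian "год": Cyrillic letters are outside latin-1.
print(fits_latin1(u'\u0433\u043e\u0434'))
```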

Original comment by reingart@gmail.com on 4 Oct 2012 at 11:32