Open GoogleCodeExporter opened 8 years ago
Hi David, sorry for the delay. I completely missed this issue (I'm turning on
notifications so I don't miss one again).
If you want, I can give you commit access so you can submit your code, and then
we can review and merge it.
Original comment by reingart@gmail.com
on 3 Oct 2011 at 5:52
Basic, experimental py3k support was added in rev c1a331646f42 and later.
Currently only text (built-in fonts) works.
Embedded TTF fonts and image support are not ported yet.
https://code.google.com/p/pyfpdf/source/detail?r=f866a1306719bc5a5c9a0d1ea38f5f0f4278d5a8
Original comment by reingart@gmail.com
on 6 Aug 2012 at 8:12
Hello, here is simple support for compression in py3k:
--- a/fpdf/fpdf.py Fri Aug 17 01:12:31 2012 -0300
+++ b/fpdf/fpdf.py Wed Dec 19 11:45:40 2012 +0400
@@ -1072,7 +1072,10 @@
     self._out('endobj')
     #Page content
     if self.compress:
-        p = zlib.compress(self.pages[n])
+        if PY3K:
+            p = zlib.compress(self.pages[n].encode("latin-1")).decode("latin-1")
+        else:
+            p = zlib.compress(self.pages[n])
     else:
         p = self.pages[n]
     self._newobj()
Original comment by romiq...@gmail.com
on 19 Dec 2012 at 7:52
Attachments:
Experimental unicode fonts support with Python3 (via 2to3)
Original comment by romiq...@gmail.com
on 19 Dec 2012 at 1:01
Attachments:
Thanks romiq.kh!
Could you explain some of your changes?
Why are you converting everything to unicode?
I don't know whether we should handle all internal strings as unicode or use
bytes instead.
What do you think?
Is this working both in py2.x and py3k?
Did you test this with non latin1 characters? (there are some tests available)
I'm not sure if self.pages[n].encode("latin-1") would break, or it just works
by chance / coincidence
I've made you a committer; if you want, you can create a py3k branch to test it.
Original comment by reingart@gmail.com
on 19 Dec 2012 at 6:49
Thanks for the commit rights.
The main idea of the patch set is: "if we can't avoid the huge amount of unicode
self._out handling in the FPDF class, convert to bytes later".
Otherwise it would need:
1. Building the self._out output as byte strings with b"" literals (impossible
on 2.x before 2.6)
2. No int-to-str function for bytes (so decode, format, encode)
Not sure if massive use of PY3K checks is fine.
> Is this working both in py2.x and py3k?
It should, at least with py2.7. I can't test 2.4 and complain about // operator.
> Did you test this with non latin1 characters? (there are some tests available)
unifonts.py works with some fonts, see attachment
btw, please fix HelloWorld.txt: "Russian: Здравствуй, мир"
> I'm not sure if self.pages[n].encode("latin-1") would break, or it just works
> by chance / coincidence
No coincidence. Just a dirty hack:
The pages array contains unicode strings, but they hold only characters in the
range [0..255].
On the other side, zlib.compress requires a byte string.
1. self.pages[n].encode("latin-1") - unicode -> bytes
2. zlib.compress() - bytes -> compressed bytes
3. .decode("latin-1") - bytes -> unicode
The last conversion, unicode -> bytes, is made by
    if PY3K:
        # TODO: proper unicode support
        f.write(self.buffer.encode("latin1"))
and this is not my patch :)
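The three conversion steps can be sketched as a standalone Python 3 snippet. This is only an illustration, not pyfpdf's actual code: compress_page and the sample page string are hypothetical. It also shows the limit of the hack: a character outside [0..255] makes the encode step fail.

```python
import zlib


def compress_page(page):
    """Compress a text page whose characters all lie in the range 0..255.

    Hypothetical helper for illustration; not part of pyfpdf.
    """
    raw = page.encode("latin-1")      # 1. unicode -> bytes (one byte per character)
    packed = zlib.compress(raw)       # 2. bytes -> compressed bytes
    return packed.decode("latin-1")   # 3. bytes -> unicode, so it fits a text buffer


page = "BT /F1 12 Tf (Hello) Tj ET"
stored = compress_page(page)

# Reversing the steps recovers the original page exactly.
restored = zlib.decompress(stored.encode("latin-1")).decode("latin-1")

# A character above code point 255 cannot take this route.
try:
    compress_page("\u0417")           # Cyrillic capital letter Ze
    non_latin1_ok = True
except UnicodeEncodeError:
    non_latin1_ok = False
```

Latin-1 maps each byte value 0..255 to the code point with the same number, which is why the round trip loses nothing as long as the buffer never holds a character above 255.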
Original comment by romiq...@gmail.com
on 19 Dec 2012 at 7:34
Attachments:
> Nt sure if massive PY3K using is fine.
I'm not sure either; please wrap it in a function so we can refactor it in the
future if we need to.
>> Is this working both in py2.x and py3k?
> It should, at least with py2.7. I can't test 2.4 and complain about //
> operator.
2.4 is too old; 2.5 compatibility would be great; 2.7 is mandatory.
>> Did you test this with non latin1 characters? (there are some tests available)
> unifonts.py works with some fonts, see attachment
Can you download the unicode font pack and test all the fonts?
(To check that this patch doesn't break anything.)
> btw, please fix HelloWorld.txt: "Russian: Здравствуй, мир"
Go ahead ;-)
Sorry, I only speak Spanish and a little English, so that is surely an
inaccurate Google translation...
>> I'm not sure if self.pages[n].encode("latin-1") would break, or it just
>> works by chance / coincidence
> No coincidence. Just a dirty hack:
> The pages array contains unicode strings, but they hold only characters in the
> range [0..255].
Yes, it is a dirty hack, but it is the best we have so far ;-)
> The last conversion, unicode -> bytes, is made by
>     if PY3K:
>         # TODO: proper unicode support
>         f.write(self.buffer.encode("latin1"))
> and this is not my patch :)
Yes, as long as it doesn't break existing code, I think we can live with that :-)
Feel free to address these comments, commit, and close this issue for now, so
pyfpdf can finally be py3k compatible!
Original comment by reingart@gmail.com
on 21 Dec 2012 at 2:31
[deleted comment]
Hi all,
In the py3k branch, the following tests pass (+) or fail (-):
                              3.2   2.7
py3k.py (with compression)     +     -
unifonts.py                    +     +
ex_unicode                     +     +
html.py                        -     +
nb_pages.py                    +     +
To be continued...
Original comment by romiq...@gmail.com
on 22 Dec 2012 at 5:51
Hi, all.
While porting to Python 3, a rounding issue was found:
Font lohit_hi.ttf,
character 247, 832 * 0.9765625 -> 812.5 (ttfonts.py, line 868)
Python 2.7 round(812.5) -> 813.0
Python 3.2 round(812.5) -> 812
This prevents byte-for-byte identical PDF generation.
Does anyone have any proposals?
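The difference comes from tie handling: Python 2's round() sends ties away from zero, while Python 3's round() sends them to the nearest even integer. One possible way to get the same result on both is a small helper like the sketch below. The name round_half_away is hypothetical, and this is not necessarily how the branch ended up solving it.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero.

    Hypothetical helper; mimics Python 2's round() for ties but returns an int.
    """
    if x >= 0:
        return int(math.floor(x + 0.5))
    return int(math.ceil(x - 0.5))


width = 832 * 0.9765625                 # exactly 812.5
print(round_half_away(width))           # 813 on both Python 2.7 and Python 3
```

Note that the helper returns an int on both versions, whereas Python 2's round() returns a float, so any code that formats the value may need a matching adjustment.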
Original comment by romiq...@gmail.com
on 25 Dec 2012 at 12:10
Solved: byte-for-byte accuracy achieved with the whole set (93 TTF fonts).
The test will be pushed soon.
Now sha1(pdf_1.7) == sha1(pdf py2.7) == sha1(pdf py3.2)
Next is HTML generation: there is an import name conflict because HTMLParser was
renamed to html.parser, which conflicts with the html.py file (and the html.py test).
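For the rename itself, a common idiom is a guarded import, sketched below. TagCollector is a hypothetical class used only to show that the imported name behaves the same on both versions. The idiom assumes the standard library html package is the one found on Python 3; it does not by itself resolve shadowing by a local html.py that comes first on sys.path.

```python
try:
    from html.parser import HTMLParser   # Python 3 name
except ImportError:
    from HTMLParser import HTMLParser    # Python 2 name


class TagCollector(HTMLParser):
    """Hypothetical parser that records the start tags it sees."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed("<p>Hello <b>world</b></p>")
print(collector.tags)   # ['p', 'b']
```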
Original comment by romiq...@gmail.com
on 25 Dec 2012 at 8:44
What about adding the six package as a dependency for code that runs on both
2.x and 3.x?
http://packages.python.org/six/
This would solve some issues with ugly code (especially the PNG loading code).
Original comment by romiq...@gmail.com
on 8 Jan 2013 at 12:20
I had to do a quick & dirty py3k conversion, and as of today it is almost
working in the default branch.
Sorry, I almost completely forgot about your py3k branch, and as I needed to get
images working, I did some hacks to get complete py3k support.
We should release 1.7.2, and then you should try to merge your py3k branch,
especially the rounding fixes and the final byte handling in _out et al., and
then release 1.7.3.
Great work romiq.kh!
Original comment by reingart@gmail.com
on 5 Feb 2014 at 5:02
Hello, it looks like Python 3 mainly works (using current default branch), but
I have been looking at fixing a couple of minor issues. I am planning to go
through Roman’s py3k branch and pull out any useful changes that are still
relevant. (Most of them I think are already applied or no longer needed.)
Do you think it would be okay to drop Python 2.5 support? Python 2.6 allows the
b"..." byte string syntax, and I would like to avoid the temporary Latin-1
encoding hack. So instead of code like this:
    self._out(sprintf('/MediaBox [0 0 %.2f %.2f]',w_pt,h_pt))
    self._out('>>')
    self._out('endobj')
it looks like this:
    self._out(sprintf('/MediaBox [0 0 %.2f %.2f]',w_pt,h_pt))
    self._out(b'>>')
    self._out(b'endobj')
with sprintf() calling encode('latin1') behind the scenes.
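A minimal sketch of that idea, assuming sprintf() is a thin wrapper around %-formatting. The body of sprintf and the Buffer class here are assumptions for illustration; the thread only states that sprintf() would encode to Latin-1 internally.

```python
def sprintf(fmt, *args):
    """Format as text, then encode to Latin-1 bytes (assumed implementation)."""
    return (fmt % args).encode("latin1")


class Buffer(object):
    """Hypothetical stand-in for the part of FPDF that collects output."""

    def __init__(self):
        self.chunks = []

    def _out(self, data):
        # Every chunk is bytes, so no text-to-bytes conversion is needed later.
        self.chunks.append(data + b"\n")

    def getvalue(self):
        return b"".join(self.chunks)


buf = Buffer()
buf._out(sprintf('/MediaBox [0 0 %.2f %.2f]', 595.28, 841.89))
buf._out(b'>>')
buf._out(b'endobj')
```

With this layout the formatting still happens on text, which sidesteps the lack of placeholder formatting on byte strings, and only the finished line is converted.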
Original comment by vadm...@gmail.com
on 4 Jan 2015 at 9:37
Hi, vadmium. As for the py3k branch, that is true: nothing useful in it is still
relevant.
The last good thing is the use of struct.pack instead of bit shifting. Everything
else is just hacks for 3k.
As for dropping 2.5 support, my vote is "still not sure, can we wait another year?".
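To illustrate the struct.pack point, the sketch below builds the same big-endian 16-bit value both ways. The function names are hypothetical, and which fields the py3k branch actually converted is not shown in this thread; big-endian 16-bit is assumed here only as a typical case.

```python
import struct


def u16_by_shifting(value):
    """Big-endian unsigned 16-bit value built byte by byte."""
    return bytes(bytearray([(value >> 8) & 0xFF, value & 0xFF]))


def u16_by_struct(value):
    """The same encoding via struct.pack ('>H' = big-endian unsigned short)."""
    return struct.pack(">H", value)


same = u16_by_shifting(0x1234) == u16_by_struct(0x1234)
```

The struct version returns a byte string on both Python 2 and Python 3 without any chr() or ord() calls, which is where hand-written shifting code tends to diverge between the two.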
Original comment by romiq...@gmail.com
on 4 Jan 2015 at 8:23
Okay, I’ll try to ensure my changes are compatible with 2.5. I might end up
leaving the Latin-1 hack there for the time being.
I think you are right about the py3k branch. The only things I can see are:
* The struct packing that you mentioned
* An error message formatting fix; I will make a pull request for these two
when I have a chance
* Part of the rounding for TTF font metrics
<https://code.google.com/p/pyfpdf/source/detail?r=97f2002af77a>. I think the
part in fpdf.py is not applied in the default branch. But maybe it is better to
just keep the code simple, than to make the rounding the same in this corner
case.
Original comment by vadm...@gmail.com
on 5 Jan 2015 at 3:50
Thanks for mentioning the TTF rounding patch; this should be covered by a test.
I'll propose a patch for this later. The issue appears only with some fonts.
Original comment by romiq...@gmail.com
on 5 Jan 2015 at 7:41
I agree with romiq.kh; I still need Python 2.5 support for some of my customers.
Also, I think the byte prefix doesn't add much that is useful, and in fact it
could be error prone, as byte strings still don't support formatting via
placeholders (and maybe other string operations).
The latin1 hack is a nasty one, but it helps keep the code clean, consistent
and compatible, and it should not imply any serious performance
penalty (IIRC romiq.kh pointed that out in previous comments).
Original comment by reingart@gmail.com
on 5 Jan 2015 at 5:24
Original issue reported on code.google.com by
firefigh...@gmail.com
on 17 May 2011 at 6:35