dquangsinh / fb2pdf

Automatically exported from code.google.com/p/fb2pdf

PDF info generation #6

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Currently PDF info generation is commented out in the code (look for 'pdfinfo').
This was due to incorrect display of non-ASCII characters. Perhaps the same trick
we use for UTF-8 encoding of the TOC would work here?

Original issue reported on code.google.com by kroko...@gmail.com on 25 Feb 2007 at 5:52

GoogleCodeExporter commented 8 years ago
The PDF TOC trick doesn't work here as is (I tried it yesterday).

Original comment by sudarkoff on 25 Feb 2007 at 7:39

GoogleCodeExporter commented 8 years ago

Original comment by kroko...@gmail.com on 8 Mar 2007 at 4:59

GoogleCodeExporter commented 8 years ago

Original comment by kroko...@gmail.com on 9 Mar 2007 at 5:32

GoogleCodeExporter commented 8 years ago
I wonder whether it is possible to have UTF-8 in PDF info at all. Has anybody seen
a PDF file with Russian info?

If I look into PDF file I can see:

/Author()/Title()/Subject()/Creator(LaTeX with hyperref package)/Producer(pdfeTeX-1.21a)/Keywords()
/CreationDate (D:20070313221053-07'00')

I guess we can do a small Python hack that adds the author and title to the
already generated PDF file.

Original comment by kroko...@gmail.com on 14 Mar 2007 at 5:37
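On the UTF-8 question: the PDF format lets text strings in the info dictionary be written either in PDFDocEncoding or as UTF-16BE prefixed with the byte order mark FE FF, so UTF-8 itself is not one of the options. Below is a minimal Python sketch of that encoding step only. The function name `pdf_text_string` is hypothetical, it is not part of fb2pdf, and whether a given reader (Sony Reader, Acrobat) renders the result depends on its fonts.

```python
import codecs


def pdf_text_string(text):
    """Return `text` as a PDF hex string literal (hypothetical helper).

    Plain ASCII is written byte for byte; anything else is written as
    UTF-16BE with a leading FE FF byte order mark.
    """
    try:
        raw = text.encode("ascii")
    except UnicodeEncodeError:
        raw = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    # Hex strings avoid having to escape parentheses and backslashes.
    return "<" + raw.hex().upper() + ">"


if __name__ == "__main__":
    print("/Title " + pdf_text_string("Вадим Залива"))
```

Patching such a value into an existing PDF would additionally require rewriting the file's cross-reference table, which this sketch does not attempt.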

GoogleCodeExporter commented 8 years ago
Since the eBook does not show Cyrillic from PDF info, we will need to transliterate.
The first step is to develop a transliteration function per:

http://www.gsnti-norms.ru/norms/common/doc.asp?0&/norms/stands/7_79.htm

Original comment by kroko...@gmail.com on 14 Mar 2007 at 5:09

GoogleCodeExporter commented 8 years ago
We could use the 'pytils' module:
http://cheeseshop.python.org/pypi/pytils
http://code.google.com/p/pythy
The library is under the GPL.

Here is an example:

>>> import pytils
>>> pytils.translit.translify(u"Вадим Залива")
'Vadim Zaliva'
>>> pytils.translit.translify(u"Александр Сова")
'Aleksandr Sova'

Original comment by alex.s...@gmail.com on 14 Mar 2007 at 9:11

GoogleCodeExporter commented 8 years ago
what algorithm do they use?

Original comment by kroko...@gmail.com on 14 Mar 2007 at 9:23

GoogleCodeExporter commented 8 years ago
FYI: http://en.wikipedia.org/wiki/Transliteration_of_Russian_into_English

Original comment by kroko...@gmail.com on 14 Mar 2007 at 9:23

GoogleCodeExporter commented 8 years ago
It's their own, but very close to GOST 16876 (1971). These standards are useless
for us because all of them (except that 30-year-old GOST) use diacritics. I do not
think we want that.
I like the algorithm implemented by 'pytils' better than GOST. It is closer to
what is actually used now.

Original comment by alex.s...@gmail.com on 14 Mar 2007 at 10:00
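To make the "no diacritics" requirement concrete, here is a minimal table-driven sketch in Python. The mapping is an illustrative assumption: it is neither the pytils table nor any GOST standard, and names such as `TABLE` and `translit` are made up for this example.

```python
# Illustrative diacritic-free mapping (an assumption, not pytils or GOST).
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh", "ц": "ts",
    "ч": "ch", "ш": "sh", "щ": "shch", "ъ": "", "ы": "y", "ь": "",
    "э": "e", "ю": "yu", "я": "ya",
}


def translit(text):
    """Replace Russian letters with ASCII; leave other characters as is."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in TABLE:
            latin = TABLE[lower]
            # Preserve capitalisation: "Щ" becomes "Shch", not "shch".
            if ch != lower:
                latin = latin.capitalize()
            out.append(latin)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(translit("Вадим Залива"))
```

Because every target is plain ASCII, the output can be written into PDF info without any encoding tricks.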

GoogleCodeExporter commented 8 years ago
vberko just opened my eyes to a new issue: Sony Readers which have been "russified"
(flashed with Russian fonts) might show Russian letters correctly in PDF info.
In this case we need an option controlling whether to generate Russian or
transliterated PDF info.

Here is a link about using Cyrillic in PDF meta info:

http://www.mccme.ru/free-books/p_cher.htm

Original comment by kroko...@gmail.com on 15 Mar 2007 at 12:32

GoogleCodeExporter commented 8 years ago
For now, calls to pytils.translit.translify() have been added. In the future we
will have an option to specify whether to use translit or try to generate
Russian info.

Original comment by kroko...@gmail.com on 18 Mar 2007 at 4:26

GoogleCodeExporter commented 8 years ago
The last workaround from trivee does not completely fix this problem. To quote him
(translated from Russian):

Actually, this is not how it should be done. The way it is now works in
Sony Reader but does not work in Acrobat Reader. To fix it properly, we need
to pass several options (pdfauthor, pdftitle) to the hyperref package,
or not use hyperref at all.

Original comment by kroko...@gmail.com on 19 Mar 2007 at 5:47
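A minimal sketch of what passing those options could look like on the generator side, assuming the LaTeX preamble is written from Python. `\hypersetup` with the `pdftitle` and `pdfauthor` keys is hyperref's own interface; the helper names `latex_escape` and `hypersetup_line` are hypothetical and do not come from fb2tex.py.

```python
# Characters that are special in LaTeX and their escaped forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "%": r"\%",
    "#": r"\#",
    "&": r"\&",
    "$": r"\$",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def latex_escape(text):
    """Escape LaTeX special characters one at a time."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def hypersetup_line(title, author):
    """Build a \\hypersetup line that sets the PDF title and author."""
    return "\\hypersetup{pdftitle={%s},pdfauthor={%s}}" % (
        latex_escape(title),
        latex_escape(author),
    )


if __name__ == "__main__":
    print(hypersetup_line("Monya Cackes", "Sevela Yefraim"))
```

The sketch passes already transliterated ASCII values; non-ASCII titles would additionally depend on how hyperref is configured to encode PDF strings, which is not covered here.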

GoogleCodeExporter commented 8 years ago
2007-03-22 11:42:40,269 DEBUG    Downloading 'http://s3.amazonaws.com/fb2pdf/fe71505fd3f7dcf3c4de7cfb6ff4c209.fb2' to file 'sbornik_receptov__ryba_carica_stola.fb2'.

2007-03-22 11:42:43,094 ERROR    Unknown Processing Error - treating as Persistent
Traceback (most recent call last):
  File "/usr/bin/fbdaemon", line 391, in main
    processMessage(m)
  File "/usr/bin/fbdaemon", line 175, in processMessage
    callbacks)
  File "/usr/bin/fbdaemon", line 306, in processDocument
    fb2tex.fb2tex(fbfilename, texfilename)
  File "/usr/lib/python2.4/site-packages/fb2pdf/fb2tex.py", line 308, in fb2tex
    processDescription(find(fb,"description"), f)
  File "/usr/lib/python2.4/site-packages/fb2pdf/fb2tex.py", line 627, in processDescription
    _uwrite(f,pytils.translit.translify(title))
  File "/usr/lib/python2.4/site-packages/pytils/translit.py", line 171, in translify
    raise ValueError("Unicode string doesn't transliterate completely, " + \
ValueError: Unicode string doesn't transliterate completely, is it russian?

Original comment by kroko...@gmail.com on 23 Mar 2007 at 7:37

GoogleCodeExporter commented 8 years ago
It looks like the title string we got is not a unicode string.
The document is in windows-1251 encoding. The same string works just fine when
entered manually.
I'll play with it more.

Original comment by alex.s...@gmail.com on 24 Mar 2007 at 8:13
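If the title really does arrive as raw bytes rather than a unicode string, one option is to decode it explicitly before transliteration. The sketch below is an assumption about how that could be done, not the fix that was eventually applied; `to_text` and its encoding list are hypothetical.

```python
def to_text(value, encodings=("utf-8", "cp1251")):
    """Return `value` as a text string, decoding bytes if necessary.

    UTF-8 is tried first because windows-1251 bytes for Russian text
    are almost never valid UTF-8, so a wrong guess fails loudly.
    """
    if isinstance(value, str):
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this never raises.
    return value.decode("latin-1")


if __name__ == "__main__":
    raw = "Рыба - царица стола".encode("cp1251")
    print(to_text(raw))
```

In a real FB2 file the encoding is declared in the XML header, so trusting the XML parser's output is preferable to guessing whenever that is possible.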

GoogleCodeExporter commented 8 years ago
2007-04-15 00:33:51,927 DEBUG    Downloading 'http://s3.amazonaws.com/fb2pdf/d91ff3289ed39d50339eb0d1cea6fbc5.fb2' to file 'sevela_yefraim_monya_cackes_znamenosec.fb2'.

2007-04-15 00:33:53,738 ERROR    Unknown Processing Error - treating as Persistent
Traceback (most recent call last):
  File "/usr/bin/fbdaemon", line 391, in main
    processMessage(m)
  File "/usr/bin/fbdaemon", line 175, in processMessage
    callbacks)
  File "/usr/bin/fbdaemon", line 306, in processDocument
    fb2tex.fb2tex(fbfilename, texfilename)
  File "/usr/lib/python2.4/site-packages/fb2pdf/fb2tex.py", line 321, in fb2tex
    processDescription(find(fb,"description"), f)
  File "/usr/lib/python2.4/site-packages/fb2pdf/fb2tex.py", line 640, in processDescription
    _uwrite(f,pytils.translit.translify(title))
  File "/usr/lib/python2.4/site-packages/pytils/translit.py", line 171, in translify
    raise ValueError("Unicode string doesn't transliterate completely, " + \
ValueError: Unicode string doesn't transliterate completely, is it russian?

Original comment by kroko...@gmail.com on 15 Apr 2007 at 7:37
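Whatever the root cause, the traceback shows that a single title which cannot be transliterated aborts the whole conversion. A defensive wrapper is one way to avoid that. This is only a sketch under assumptions: the transliteration function is passed in as a parameter (in fb2tex.py it would presumably be `pytils.translit.translify`), and `safe_translify` is a hypothetical name.

```python
import unicodedata


def safe_translify(text, translify):
    """Transliterate `text`, degrading to plain ASCII on failure.

    `translify` is any callable that raises ValueError when it cannot
    transliterate its input completely.
    """
    try:
        return translify(text)
    except ValueError:
        # Split accented letters into base letter + combining mark,
        # drop the marks, then replace whatever is still non-ASCII.
        decomposed = unicodedata.normalize("NFKD", text)
        stripped = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
        return stripped.encode("ascii", "replace").decode("ascii")


if __name__ == "__main__":
    def always_fails(_text):
        raise ValueError("cannot transliterate")

    print(safe_translify("café", always_fails))
```

The fallback loses information (unmapped characters become "?"), but the document still gets converted and the metadata stays ASCII-only.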

GoogleCodeExporter commented 8 years ago
Fixed using a code snippet sent by bird.

Original comment by kroko...@gmail.com on 23 Apr 2007 at 4:40