JimmXinu / FanFicFare

FanFicFare is a tool for making eBooks from stories on fanfiction and other web sites.

EPUB output is invalid: unique-identifier in OEBPS/content.opf doesn't point to an ID field. #8

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Generate EPUB output, e.g., "python downloader.py 
http://www.fanfiction.net/s/5782108/1/ epub".
2. Test it with epubcheck or http://threepress.org/document/epub-validate/. One 
of the errors will be: "ERROR: hprationality.epub/OEBPS/content.opf(3): bad 
value for attribute "unique-identifier"".

What is the expected output? What do you see instead?
The EPUB output is invalid. While it may work on some devices, it may fail on 
others.

What version of the product are you using? On what operating system?
I'm using current tip (26:54fc9b30ced5) on Python 2.6.4 (Ubuntu 9.10), plus the 
patches from issue 6 and issue 7 (which don't affect content.opf generation).

Please provide any additional information below.
There are several different issues causing validation to fail. This is one of 
them. See section 2.1 of the current OPF draft, and a lay explanation of how 
the unique-identifier attribute is used:

http://www.idpf.org/doc_library/epub/OPF_2.0.1_draft.htm#Section2.1
http://netkingcol.blogspot.com/2010/01/closer-look-at-opf.html

The value of the "package" element's "unique-identifier" attribute is an IDREF, 
which points to an element with that ID. (It should, therefore, be constant for 
a format.) So, if we set it to "BookId", then the unique identifier for the 
EPUB will be the contents of an element setting an ID attribute to "BookId". 
(Previously, the unique-identifier was being randomly set, so it didn't point 
to anything; the element with 'BookId' as its ID had constant content.)

The attached patch does this, and changes the constants to fit.
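
For illustration only (this is not the attached patch), here is a rough sketch of the rule described above: the "unique-identifier" attribute on "package" has to match the "id" of a dc:identifier element. The two OPF snippets, the identifier values, and the helper name unique_identifier_resolves are made up for this example and are not taken from the project's code.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used by <dc:identifier> in OPF 2.0 metadata.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical minimal content.opf: unique-identifier points at id="BookId".
VALID_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
         unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example</dc:title>
    <dc:identifier id="BookId">urn:example:placeholder</dc:identifier>
  </metadata>
</package>"""

# Hypothetical broken variant: the attribute holds a random-looking value
# that no element declares as its id, which is the failure reported here.
BROKEN_OPF = VALID_OPF.replace('unique-identifier="BookId"',
                               'unique-identifier="a1b2c3"')


def unique_identifier_resolves(opf_text):
    """Return True if package/@unique-identifier matches a dc:identifier id."""
    root = ET.fromstring(opf_text)
    target = root.get("unique-identifier")
    if not target:
        return False
    identifiers = root.iter("{%s}identifier" % DC_NS)
    return any(el.get("id") == target for el in identifiers)


print(unique_identifier_resolves(VALID_OPF))
print(unique_identifier_resolves(BROKEN_OPF))
```

A check along these lines only covers this one error; epubcheck remains the real test for the other validation failures.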

Some resources suggest using an ISBN or the like for an ID, meaning that the 
unique-identifier is the same across all copies of a book. That's beyond the 
scope of this particular report, but Python's UUID module supports making UUIDs 
from URLs, which could very well be the easiest way to get a canonical UUID for 
a fic.
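
A minimal sketch of that idea, assuming the story URL is used as-is for the name: uuid.uuid5 with uuid.NAMESPACE_URL gives the same UUID every time for the same URL. The helper name story_uuid and the "urn:uuid:" prefix are assumptions for this example, not something the downloader currently does.

```python
import uuid


def story_uuid(story_url):
    """Derive a repeatable (name-based, SHA-1) UUID from a story URL."""
    return uuid.uuid5(uuid.NAMESPACE_URL, story_url)


url = "http://www.fanfiction.net/s/5782108/1/"

# The same URL always maps to the same UUID, so every copy of the EPUB
# generated from it would carry the same identifier.
book_id = "urn:uuid:%s" % story_uuid(url)
print(book_id)
```

One caveat: the URL would probably need normalizing first (for example, dropping the chapter number), since different chapter URLs for the same fic would otherwise produce different UUIDs.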

Original issue reported on code.google.com by adam.buc...@gmail.com on 16 Sep 2010 at 4:30

GoogleCodeExporter commented 9 years ago
This, or a similar change, has been incorporated.

Original comment by retiefj...@gmail.com on 16 Oct 2010 at 2:05