fatty- / daisy-pipeline

Automatically exported from code.google.com/p/daisy-pipeline

daisy202-to-epub3 URI encoding problem when there are spaces in the filename #304

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
See attached input and output.

Notice that the input book has "the content.xhtml" as its content document.

In the output book, however, the content document on disk is named "the%20content.xhtml". The package file entry is:

<item href="Content/the%20content.xhtml" id="item_1" media-overlay="item_6" media-type="application/xhtml+xml"/>

That href would be fine if the filename still had a space in it, but since the file on disk is now named "the%20content.xhtml", the correct URI would be "the%2520content.xhtml".
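As a quick illustration of why the correct URI would be "the%2520content.xhtml", here is a minimal Python sketch. It uses the standard library's urllib.parse rather than anything from the Pipeline, and the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

on_disk_input = "the content.xhtml"      # filename in the input book
on_disk_output = "the%20content.xhtml"   # filename written to the output book

# A URI reference is the percent-encoded form of the real filename.
href_for_input = quote(on_disk_input)    # space becomes %20
href_for_output = quote(on_disk_output)  # the literal "%" becomes %25

# Decoding the href from the package file gives the name a reader looks up.
looked_up = unquote("Content/the%20content.xhtml")

print(href_for_input)
print(href_for_output)
print(looked_up)
```

The decoded href points at a name with a space in it, which is not the name of the file that was actually written.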

Even better would be one of these possible solutions:

* escape the filename in the XML attributes but leave the spaces on disk
* use unescaped spaces in the XML attributes and leave the spaces on disk
* remove spaces from filenames
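The third option could look roughly like the sketch below. The helper name strip_spaces and the choice of an underscore as the replacement are assumptions for illustration, not what the Pipeline does.

```python
import re


def strip_spaces(filename: str) -> str:
    """Hypothetical helper: replace each run of whitespace with one underscore."""
    return re.sub(r"\s+", "_", filename)


renamed = strip_spaces("the content.xhtml")
print(renamed)
```

A real implementation would also need to update every reference to the renamed file (package file, SMIL, navigation documents) and handle collisions, for example when both "a b.xhtml" and "a_b.xhtml" exist in the input.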

Note that EPUBCheck issues a warning for filenames with spaces.

I don't know to what extent problems like this are prevalent in Pipeline scripts; I've only checked Daisy 202 to EPUB3.

Here is the full output from EPUBCheck:

Epubcheck Version 3.0

Validating against EPUB version 3.0
WARNING: ..epub/EPUB/Content/the content.xhtml: Filename contains spaces. Consider changing filename such that URI escaping is not necessary
WARNING: ..epub/EPUB/Content/the content.smil: Filename contains spaces. Consider changing filename such that URI escaping is not necessary
ERROR: ..epub: OPS/XHTML file EPUB/Content/the content.xhtml is missing
ERROR: ..epub: File EPUB/Content/the content.smil is missing
ERROR: ..epub/EPUB/Content/ncx.xml(5,172): 'ch1': fragment identifier is not defined in 'EPUB/Content/the content.xhtml'
ERROR: ..epub/EPUB/Content/ncc.xhtml(7,110): 'ch1': fragment identifier is not defined in 'EPUB/Content/the content.xhtml'
WARNING: ..epub: item (EPUB/Content/the%20content.smil) exists in the zip file, but is not declared in the OPF file
WARNING: ..epub: item (EPUB/Content/the%20content.xhtml) exists in the zip file, but is not declared in the OPF file

Check finished with warnings or errors
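The "is missing" errors and the "not declared in the OPF file" warnings are two sides of the same mismatch. The sketch below reproduces that pairing in Python. It is not EPUBCheck's actual logic; the function name, the "EPUB/" base path handling, and the .smil href (inferred from the log, not shown in the package file excerpt) are assumptions.

```python
from urllib.parse import unquote


def compare_manifest_to_zip(manifest_hrefs, zip_entries, base="EPUB/"):
    """Return (missing, undeclared) after percent-decoding each manifest href."""
    declared = {base + unquote(href) for href in manifest_hrefs}
    entries = set(zip_entries)
    missing = sorted(declared - entries)     # declared but not in the zip
    undeclared = sorted(entries - declared)  # in the zip but not declared
    return missing, undeclared


manifest_hrefs = ["Content/the%20content.xhtml", "Content/the%20content.smil"]
zip_entries = [
    "EPUB/Content/the%20content.xhtml",
    "EPUB/Content/the%20content.smil",
]

missing, undeclared = compare_manifest_to_zip(manifest_hrefs, zip_entries)
print("missing:", missing)
print("undeclared:", undeclared)
```

Each file shows up once as missing (under its decoded name) and once as undeclared (under the percent-encoded name that was written to the zip).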

Original issue reported on code.google.com by marisa.d...@gmail.com on 26 Apr 2013 at 1:11


GoogleCodeExporter commented 9 years ago

Original comment by marisa.d...@gmail.com on 26 Apr 2013 at 1:12

GoogleCodeExporter commented 9 years ago

Original comment by josteinaj@gmail.com on 26 Apr 2013 at 7:33

GoogleCodeExporter commented 9 years ago
The issue is that zip entry names provided to px:zip must not contain percent-encoded characters.
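To illustrate the principle (this is not the px:zip step or the fix in the pull request), here is a minimal Python sketch that decodes each href before using it as a zip entry name. The write_entries helper and the sample file content are hypothetical.

```python
import io
import zipfile
from urllib.parse import unquote


def write_entries(files):
    """Build a zip in memory; keys are percent-encoded hrefs, values are bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for href, data in files.items():
            # Entry names are plain filenames, so decode the href first.
            archive.writestr(unquote(href), data)
    return buffer.getvalue()


zip_bytes = write_entries({"EPUB/Content/the%20content.xhtml": b"<html/>"})

with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = archive.namelist()

print(names)
```

With the entry name decoded, the existing href "Content/the%20content.xhtml" in the package file resolves to the file actually stored in the archive.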

Pull request containing fix: 
https://github.com/daisy-consortium/pipeline-modules-common/pull/16

Original comment by josteinaj@gmail.com on 26 Apr 2013 at 8:55

GoogleCodeExporter commented 9 years ago

Original comment by josteinaj@gmail.com on 20 Jun 2013 at 2:05