w3c / epubcheck

The conformance checker for EPUB publications
https://www.w3.org/publishing/epubcheck/
BSD 3-Clause "New" or "Revised" License

Error PKG-013 and RSC-005 when trying to produce a multi-language book #1572

Closed Scal-Human closed 2 months ago

Scal-Human commented 2 months ago

Hello,

First, thank you for your tool; it helps me in my personal artistic/computer-related projects.

I am trying to produce a dual-language book. As explained in "The rendition:language attribute", such a book is composed of multiple OPF root files, each with a different rendition:language attribute. I am using EpubCheck in my production stream to validate the result (it has already helped me correct the generator). With the following container.xml:

<?xml version="1.0" encoding="utf-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
  xmlns:rendition="http://www.idpf.org/2013/rendition">
  <rootfiles>
    <rootfile full-path="Content-en.opf" media-type="application/oebps-package+xml" rendition:language="en" />
    <rootfile full-path="Content-fr.opf" media-type="application/oebps-package+xml" rendition:language="fr" />
  </rootfiles>
</container>

EpubCheck (5.1.0) is complaining 3 times:

ERROR(PKG-013): C:/.../Book.epub/META-INF/container.xml(-1,-1): The EPUB file includes multiple OPS renditions.
ERROR(RSC-005): C:/.../Book.epub/META-INF/container.xml(5,111): Error while parsing file: found attribute "rendition:language", but no attributes allowed here
ERROR(RSC-005): C:/.../Book.epub/META-INF/container.xml(6,111): Error while parsing file: found attribute "rendition:language", but no attributes allowed here

I have to admit, I have not yet decided whether I will publish separate languages or a mixed book, but I would like to have the possibility in my production stream.

Salutations, Scal

mattgarrish commented 2 months ago

Are you trying to create a multiple rendition EPUB 2 file by any chance? Multiple renditions are only valid in EPUB 3.

Scal-Human commented 2 months ago

@mattgarrish Hello, I am trying to produce an ePub 3, but now that you mention it, maybe the XML namespaces are incorrect. I checked the W3C spec for ePub 3, and the 1.0 version and the OASIS namespace seem correct. The mimetype file contains "application/epub+zip" but the opf's contain version 2.0. I changed it to "3.0" and get more errors (concerning other checks) but still have an error (another one):

WARNING(RSC-019): C:/.../Book.epub//C:/Study/PowerProject/ePubTest/Target/Book.epub(-1,-1): EPUBs with Multiple Renditions should contain a META-INF/metadata.xml file.

mattgarrish commented 2 months ago

The mimetype file contains "application/epub+zip" but the opf's contain version 2.0.

Yes, the 2.0 setting in the package document would be the reason why it's not allowing the rendition attributes. The version attribute in the package document is what indicates the EPUB version, not anything in the container.xml file. The 1.0 version in container.xml is for the container document structure.
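
For illustration, here is a minimal sketch of where that version lives: the version attribute on the root package element of each OPF file (Content-en.opf from the container.xml above is used as the example). The identifier, title, and timestamp are placeholders, not values from the actual book, and the manifest and spine are left out.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Content-en.opf: version="3.0" here is what marks this rendition as EPUB 3 -->
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
  unique-identifier="pub-id" xml:lang="en">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <!-- placeholder values, for illustration only -->
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Title</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
  <!-- manifest and spine omitted in this sketch -->
</package>
```

The version="1.0" in container.xml stays as it is; only the package documents change.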

I changed it to "3.0" and get more errors (concerning other checks) but still have an error (another one):

Right, you're supposed to include publication-wide metadata in the metadata.xml file - separate from the rendition-specific metadata that goes in each package document. It's not required, though, which is why it's only a warning.
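
As a rough sketch, a META-INF/metadata.xml could look like the following. This assumes the root element, namespace, and required children as I understand them from the EPUB Multiple-Rendition Publications note, so check it against the note before relying on it. The identifier and timestamp are placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- META-INF/metadata.xml: publication-wide metadata shared by all renditions -->
<!-- namespace and structure assumed from the Multiple-Rendition Publications note -->
<metadata xmlns="http://www.idpf.org/2013/metadata"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  unique-identifier="pub-id">
  <!-- placeholder values, for illustration only -->
  <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
</metadata>
```

Rendition-specific metadata, such as each language's title, would still go in Content-en.opf and Content-fr.opf.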

The only thing I'd warn you about is that you're probably going to find there's no support for multiple renditions in reading systems, so you might be disappointed in the result even if you get all the issues worked out. That's why multiple renditions were split out from EPUB 3 into a separate note. We weren't going to be able to show two interoperable implementations, so EPUB 3 wouldn't have made it through the W3C process if multiple renditions were part of it.

Scal-Human commented 2 months ago

@mattgarrish Many thanks for your expertise. One last question, if I may: in this case, if I split the books per language, it may be better to stick to ePub 2, as there is no duplication (of metadata)? And in v3 it is complaining about all the metadata (opf:role ...) in what I have currently generated.

Scal-Human commented 2 months ago

@mattgarrish I am closing the issue as you have answered (with good advice). I have to review my generator a bit, but that is the goal, and EpubCheck is there to help ;-) Thanks again

mattgarrish commented 2 months ago

it may be better to stick to ePub 2, as there is no duplication (of metadata)?

If you're creating separate publications for each, then it doesn't matter which version of epub you use. The metadata.xml file only comes into play with multiple renditions in the same file. It avoids the problem of a reading system reading rendition-specific metadata from the first package document listed in the container.xml file.

And in v3 it is complaining about all the metadata (opf:role ...) in what I have currently generated.

Right, there are a number of differences between EPUB 2 and 3. EPUB 3 doesn't use the opf: attributes for metadata anymore. You have to use the refines attribute.
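
As a small sketch of that change, here is how a creator with a role might be expressed in each version. The author name and the id value are hypothetical, and each fragment sits inside the package document's metadata element.

```xml
<!-- EPUB 2: role and sort name carried by opf: attributes -->
<dc:creator opf:role="aut" opf:file-as="Doe, Jane">Jane Doe</dc:creator>

<!-- EPUB 3: the element gets an id, and meta elements point back at it via refines -->
<dc:creator id="creator01">Jane Doe</dc:creator>
<meta refines="#creator01" property="role" scheme="marc:relators">aut</meta>
<meta refines="#creator01" property="file-as">Doe, Jane</meta>
```

The value of refines is a fragment reference ("#" plus the id) to the element being described.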

If you want to migrate to EPUB 3, it's probably better to find a conversion program to do the basics than to try to convert it by hand. I can't recommend one, but you should find various options if you search.

Scal-Human commented 1 month ago

@mattgarrish Many thanks for your advice and the link to the refines attribute. I decided to go for EPUB 3 so that I can skip the toc.ncx, and, as I said, EpubCheck helped me a lot. FYI: I am in the process of rewriting the generation of my books (poetry in French but also in English, and sci-fi tales). The process is based on dotnet (MSBuild) to automate the incremental generation from Markdown sources (I plan to publish the result on GitHub). Working this way, a book is a project like any other project (C# library, PowerShell module ...), with EpubCheck at the test phase.