avdi / quarto


Can't generate Kindle files due to invalid OPF file #23

Closed (jf647 closed this issue 9 years ago)

jf647 commented 10 years ago

This isn't strictly a Quarto issue, but as we're driving Pandoc (for now), perhaps there is some workaround that can be done for Kindle.

The version of Pandoc available from the Ubuntu repositories is too old to use out of the box with Quarto: it complained that one of the --epub-* command line switches was invalid, though right now I forget which.

I installed haskell-platform, then installed the latest pandoc (1.12.2.1) using cabal. Now I can generate EPUB files, but when I run the result through kindlegen, I get this error:

kindlegen build/deliverables/test-book.epub -o test-book.mobi

*************************************************************
 Amazon kindlegen(Linux) V2.9 build 0730-890adc2
 A command line e-book compiler
 Copyright Amazon.com and its Affiliates 2013
*************************************************************

Error(opfparser):E20006: There are more than one title defined in OPF metadata. But none of them is refined with "title-type" as "main" title. Refer http://idpf.org/epub/30/spec/epub30-publications.html#sec-opf-dctitle for more info.
root@9895edd55b28:/quarto#

And looking at the content.opf that pandoc generated, we can see that there are two titles:

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="epub-id-1">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:identifier id="epub-id-1">urn:uuid:47cede43-5895-4d95-94f9-77f86b68645a</dc:identifier>
    <dc:title id="epub-title-1">Test Book</dc:title>
    <dc:title id="epub-title-2">Test Book</dc:title>
    <dc:date>2013-12-27T17:00:20Z</dc:date>
    <dc:language>en</dc:language>
    <dc:creator id="epub-creator-1">James FitzGibbon</dc:creator>
    <dc:description>A Book to Test Quarto</dc:description>

The linked page does explain how to fix this. If I run 'rake epub', then edit the content.opf file to add a meta element that marks the first title as the main one (or remove the second title), 'rake kindlegen' then succeeds, so this is fixable by post-processing the pandoc output:

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="epub-id-1">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:identifier id="epub-id-1">urn:uuid:ea23e9ba-b136-4cd7-8d95-c4e20729dd87</dc:identifier>
    <dc:title id="epub-title-1">Test Book</dc:title>
    <meta refines="#epub-title-1" property="title-type">main</meta>
    <dc:title id="epub-title-2">Test Book</dc:title>
    <dc:date>2013-12-27T17:14:08Z</dc:date>
    <dc:language>en</dc:language>
    <dc:creator id="epub-creator-1">James FitzGibbon</dc:creator>
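
The manual edit above could be automated along these lines. This is only a sketch using plain string processing: the helper name mark_main_title and the constant DC_TITLE_WITH_ID are made up for illustration, and it assumes the titles carry an id attribute the way pandoc's output does here. It is not part of Quarto or pandoc.

```ruby
# Hypothetical helper (not part of Quarto): marks the first <dc:title>
# as the main title by adding a refining <meta> element right after it.
# Assumes each title has an id attribute, as in the pandoc output above.
DC_TITLE_WITH_ID = %r{<dc:title\s+id="([^"]+)"[^>]*>.*?</dc:title>}m

def mark_main_title(opf)
  # Leave the document alone if some title is already refined.
  return opf if opf.include?('property="title-type"')

  # String#sub only touches the first match, i.e. the first title.
  opf.sub(DC_TITLE_WITH_ID) do |title|
    id = Regexp.last_match(1)
    %(#{title}\n    <meta refines="##{id}" property="title-type">main</meta>)
  end
end
```

Calling mark_main_title on the contents of content.opf returns the modified text; reading the file out of the EPUB and writing it back is left out here.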

I'm not sure how close quarto is to dropping pandoc for GEPUB for the generation step. If that's not a "Real Soon Now(tm)" thing, I can whip up a PR that adds another cleanup inside of Quarto::PandocEpub::define_tasks to remove all but the first title element from the metadata.
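
For the cleanup itself, something like the following might be enough. It is a rough sketch with made-up names (keep_first_title, DC_TITLE) that works on the OPF text as a string; how it would be wired into Quarto::PandocEpub::define_tasks is not shown, and a real XML parser may be a safer choice for less regular input.

```ruby
# Hypothetical cleanup (not part of Quarto): keeps the first <dc:title>
# element and drops every later one, along with its line.
DC_TITLE = %r{[ \t]*<dc:title\b[^>]*>.*?</dc:title>[ \t]*\n?}m

def keep_first_title(opf)
  seen = false
  opf.gsub(DC_TITLE) do |match|
    if seen
      ''        # a later title: remove it
    else
      seen = true
      match     # the first title: keep it unchanged
    end
  end
end
```

With only one title left in the metadata, kindlegen should no longer have a reason to ask for a "title-type" refinement.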

avdi commented 10 years ago

As eager as I am to switch away from Pandoc, even if I started tomorrow it would take a while to complete. So yes, please on the PR :-)

