Kobo device removes ImageId on first restart

jgoguen commented 11 years ago

The first time the Kobo sees a new book, even with existing DB rows, it starts its usual processing and strips out series information and the image ID. If there's a way to stop the Kobo from processing new files, it needs to be implemented and tested to make sure it doesn't break anything.

This is not considered a major issue: plugging the device into calibre and letting it update the metadata sets the series data and image ID again, and the Kobo device won't remove them a second time.

giorgio130 commented 11 years ago

I wanted to point out there's an easier way to get covers to show on kepub. In the OPF file, the manifest line describing the cover has to look like this:

<item href="Images/cover.jpg" id="cover.jpg" media-type="image/jpeg" properties="cover-image"/>

i.e. the string properties="cover-image" has to be added in order for the reader to pick up the cover and process it. I think adding this to the conversion code will simplify things a bit, since you won't have to add rows to the database for the cover and such.

jgoguen commented 11 years ago

Awesome, thanks for that info. I'll see about getting that in there next chance I get. Probably this weekend.

jgoguen commented 11 years ago

Per http://www.mobileread.com/forums/showpost.php?p=2402644&postcount=39 and http://idpf.org/epub/30/spec/epub30-publications.html#sec-opf-dctitle, the EPUB3 spec allows series information to be set by defining a title-type meta tag set to collection and a group-position meta tag set to the series index. It appears that there would be no harm in setting these extra tags in an EPUB2 document, and if the Kobo device reads these tags even for an EPUB2 document, the database code would be entirely unnecessary. Something to investigate in the next few days.
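For reference, a minimal sketch of what writing those two tags could look like. It uses the standard library's xml.etree.ElementTree rather than whatever the driver uses, and the function name add_series_metadata and the default id "series-title" are made up for illustration. It only shows the shape of the metadata described in the spec; whether the Kobo firmware reads it is a separate question.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def add_series_metadata(opf_xml, series, index, title_id="series-title"):
    """Return OPF XML with EPUB3-style collection metadata appended.

    Hypothetical helper: name and default id are illustrative only.
    """
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(opf_xml)
    metadata = root.find("{%s}metadata" % OPF_NS)
    if metadata is None:
        raise ValueError("OPF has no <metadata> element")

    # An extra dc:title carries the series name ...
    title = ET.SubElement(metadata, "{%s}title" % DC_NS, {"id": title_id})
    title.text = series

    # ... and two meta elements refine it: its type and the book's position.
    for prop, value in (("title-type", "collection"),
                        ("group-position", str(index))):
        meta = ET.SubElement(metadata, "{%s}meta" % OPF_NS,
                             {"refines": "#" + title_id, "property": prop})
        meta.text = value

    return ET.tostring(root, encoding="unicode")
```

Calling add_series_metadata(opf_text, "Some Series", 2) leaves the existing dc:title untouched and appends one extra title plus the two refining meta elements.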

giorgio130 commented 11 years ago

First patch from me ;)

Your code assumes the id of the cover image is "cover", which is not always the case. This patch also checks the id specified in the metadata:

diff --git a/driver.py b/driver.py
index 2494872..3f7d849 100644
--- a/driver.py
+++ b/driver.py
@@ -132,6 +132,17 @@ class KOBOTOUCHEXTENDED(KOBOTOUCH):
            return False

        opf = container.get_parsed(container.opf_file)
+       # Use the cover id declared in the metadata, if there is one
+       cover_id = None
+       for node in opf.xpath('./ns:metadata/ns:meta[@name="cover"]', namespaces = {"ns": container.opf_ns}):
+           cover_id = node.attrib["content"]
+       for node in opf.xpath('./ns:manifest/ns:item', namespaces = {"ns": container.opf_ns}):
+           if node.attrib["id"]==cover_id and ("properties" not in node.attrib or node.attrib["properties"] != 'cover-image'):
+               debug_print("KoboTouchExtended:_modify_epub:Setting cover-image")
+               node.set("properties", "cover-image")
+               container.set(container.opf_file, opf)
+               changed = True
+
        for node in opf.xpath('./ns:manifest/ns:item[@id="cover"]', namespaces = {"ns": container.opf_ns}):
            if "properties" not in node.attrib or node.attrib["properties"] != 'cover-image':
                debug_print("KoboTouchExtended:_modify_epub:Setting cover-image")

It seems to work with most or all of my ebooks, while before I got the cover to show for only about half of them.
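The lookup in the patch can be tried outside calibre. The following is a rough standalone sketch, assuming the standard library's xml.etree.ElementTree in place of the plugin's container object; the function name tag_cover_item is hypothetical, and it folds the patch's metadata lookup and the original hard-coded "cover" id into one pass.

```python
import xml.etree.ElementTree as ET

NS = {"ns": "http://www.idpf.org/2007/opf"}


def tag_cover_item(root):
    """Set properties="cover-image" on the manifest cover item.

    Returns True if the tree was modified. Hypothetical helper, not
    part of the driver.
    """
    # Prefer the id named by <meta name="cover" content="...">; fall back
    # to the conventional id "cover" when the metadata does not name one.
    cover_id = "cover"
    for meta in root.findall("./ns:metadata/ns:meta[@name='cover']", NS):
        cover_id = meta.get("content", cover_id)

    changed = False
    for item in root.findall("./ns:manifest/ns:item", NS):
        if item.get("id") == cover_id and item.get("properties") != "cover-image":
            item.set("properties", "cover-image")
            changed = True
    return changed
```

Like the patch, this overwrites any existing properties value on the cover item instead of appending to it, which is a simplification.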

jgoguen commented 11 years ago

Someone got around to testing the EPUB3 series metadata before I did and reported that it does not work.

giorgio130 commented 11 years ago

Some more issues with this code:

calibre, version 0.9.17 ERROR: Error: Error communicating with device

'list' object has no attribute 'attrib'

Traceback (most recent call last):
  File "site-packages/calibre/gui2/device.py", line 85, in run
  File "site-packages/calibre/gui2/device.py", line 551, in _upload_books
  File "calibre_plugins.kobotouch_extended.driver", line 202, in upload_books
  File "calibre_plugins.kobotouch_extended.driver", line 139, in _modify_epub
AttributeError: 'list' object has no attribute 'attrib'
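The traceback does not show the offending statement, so the cause here is a guess. This particular message is what Python raises when the result of an XPath-style query, which is a list, is used as if it were a single element. A small reproduction, assuming the standard library's xml.etree.ElementTree (lxml's xpath() likewise returns a list for node queries):

```python
import xml.etree.ElementTree as ET

NS = {"ns": "http://www.idpf.org/2007/opf"}
root = ET.fromstring(
    '<package xmlns="http://www.idpf.org/2007/opf">'
    '<metadata><meta content="immagini1.jpg" name="cover"/></metadata>'
    '</package>'
)

matches = root.findall("./ns:metadata/ns:meta[@name='cover']", NS)

# Treating the query result as one element fails, because it is a list.
try:
    matches.attrib["content"]
    error = None
except AttributeError as exc:
    error = str(exc)

# Picking the first match (or None when there is none) avoids the problem.
first = matches[0] if matches else None
cover_id = first.get("content") if first is not None else None
```

Iterating over the result, as the patch above does, is the other common way to avoid touching the list itself.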

affected opf: http://dl.dropbox.com/u/29092891/content.opf

jgoguen commented 11 years ago

I've pushed up a new copy of the code. The fixes are unrelated to this (I think), but the change I made should make it impossible to reference a list object at that point.

By the way, watch out for the removal of the no-database branch. The database manipulation code was removed from master, so no-database is no longer needed.

giorgio130 commented 11 years ago

I didn't check right away when you committed your version of my patch, but the current code can't detect covers that were correctly tagged with my code. I don't know where it went wrong.

jgoguen commented 11 years ago

Can you send me some OPF files from books that were working with your patch but not when I modified it?

giorgio130 commented 11 years ago

This is an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookID" version="2.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        <dc:identifier id="BookID" opf:scheme="UUID">A4D65303-C14E-420C-BF55-121B49B49034</dc:identifier>
        <dc:contributor opf:role="bkp">Created with writer2epub 1.1.20 by Luca Calcinai http://lukesblog.it/writer2epub</dc:contributor>
        <dc:date opf:event="creation">2013-02-08</dc:date>
        <dc:creator opf:role="aut">Author</dc:creator>
        <dc:description>
        <![CDATA[Description]]>
        </dc:description>
        <dc:language>it</dc:language>
        <dc:source>Title</dc:source>
        <dc:publisher>Autoprodotto</dc:publisher>
        <dc:date opf:event="publication">2013-02-08</dc:date>
        <dc:title>Title</dc:title>
        <meta content="1.1.20" name="writer2epub"/>
        <meta content="immagini1.jpg" name="cover"/>
        <meta content="0.6.2" name="Sigil version"/>
        <dc:date opf:event="modification">2013-02-08</dc:date>
    </metadata>
    <manifest>
        <item href="toc.ncx" id="ncx" media-type="application/x-dtbncx+xml"/>
        <item href="Images/immagini1.jpg" id="immagini1.jpg" media-type="image/jpeg"/>
        <item href="Styles/style001.css" id="style001.css" media-type="text/css"/>
        <item href="Text/content0001.xhtml" id="content0001.xhtml" media-type="application/xhtml+xml"/>
        <item href="Text/content0002.xhtml" id="content0002.xhtml" media-type="application/xhtml+xml"/>
        <item href="Text/content0003.xhtml" id="content0003.xhtml" media-type="application/xhtml+xml"/>
        <item href="Text/content0004.xhtml" id="content0004.xhtml" media-type="application/xhtml+xml"/>
        <item href="Text/content0005.xhtml" id="content0005.xhtml" media-type="application/xhtml+xml"/>
        <item href="Text/content0006.xhtml" id="content0006.xhtml" media-type="application/xhtml+xml"/>
    </manifest>
    <spine toc="ncx">
        <itemref idref="content0001.xhtml"/>
        <itemref idref="content0002.xhtml"/>
        <itemref idref="content0003.xhtml"/>
        <itemref idref="content0004.xhtml"/>
        <itemref idref="content0005.xhtml"/>
        <itemref idref="content0006.xhtml"/>
    </spine>
    <guide>
        <reference href="Text/content0001.xhtml" title="Cover Page" type="cover"/>
    </guide>
</package>

jgoguen commented 11 years ago

That's an odd one. It looks very straightforward: the cover is clearly identified, and the specified ID is in the manifest as an image/jpeg. I'll run this one through next time I'm in front of my computer.

giorgio130 commented 11 years ago

Did you change anything related to this? Detection seems to be working fine now, for some reason.

jgoguen commented 11 years ago

I haven't even looked at this yet; I've been distracted by some other things.

giorgio130 commented 11 years ago

Well, this very example is now being properly detected. Maybe the chardet changes are involved?

jgoguen commented 11 years ago

You know, that's possible. The XML files are fetched the same way as everything else so maybe the chardet changes did actually fix this as well.

jgoguen commented 11 years ago

The image ID problem is solved; the rest can be a new bug.

jgoguen / calibre-kobo-driver
