readium / readium-js

EPUB processing engine written in JavaScript
BSD 3-Clause "New" or "Revised" License

Readium performance is very poor loading large ePub with hundreds of SMIL files. #181

Open · roomisalah opened this issue 7 years ago

roomisalah commented 7 years ago

Readium performance is very poor when loading a large EPUB with hundreds of SMIL files. It takes 18-20 seconds or more to show the first page, which is not acceptable on the web.

After debugging, we noticed that Readium loads all the SMIL files at once and waits until every SMIL file has been loaded before showing the first page.

Is there any setting or way to defer these calls and load the SMIL data one by one after the page has loaded? That would improve performance.

I would appreciate any suggestions for improving Readium's performance.

Thanks in advance!

Below is the code snippet which loads all the SMIL data:

this.fillSmilData = function (callback) {
var that = this;

if (packageDocument.spineLength() <= 0) {
    callback();
    return;
}

var allFakeSmil = true;
var mo_map = [];
var parsingDeferreds = [];

for (var spineIdx = 0; spineIdx < packageDocument.spineLength(); spineIdx++) {
    var spineItem = packageDocument.getSpineItem(spineIdx);

    if (spineItem.media_overlay_id) {
        var manifestItemSMIL = packageDocument.manifest.getManifestItemByIdref(spineItem.media_overlay_id);

        if (!manifestItemSMIL) {
            console.error("Cannot find SMIL manifest item for spine/manifest item?! " + spineItem.media_overlay_id);
            continue;
        }
        //ASSERT manifestItemSMIL.media_type === "application/smil+xml"

        var parsingDeferred = $.Deferred();
        parsingDeferred.media_overlay_id = spineItem.media_overlay_id;
        parsingDeferreds.push(parsingDeferred);
        var smilJson = {};

        // Push the holder object onto the map early so that order isn't disturbed by asynchronicity:
        mo_map.push(smilJson);

        // The local parsingDeferred variable will have its value replaced on next loop iteration.
        // Must pass the parsingDeferred through async calls as an argument and it arrives back as myDeferred.
        that.parse(spineItem, manifestItemSMIL, smilJson, parsingDeferred, function (myDeferred, smilJson) {
            allFakeSmil = false;
            myDeferred.resolve();
        }, function (myDeferred, parseError) {
            //console.log('Error when parsing SMIL manifest item ' + manifestItemSMIL.href + ':');
            //console.log(parseError);
            myDeferred.resolve();
        });
    } else {
        mo_map.push(makeFakeSmilJson(spineItem));
    }
}

// $.when() resolves only after every per-SMIL parsing deferred has resolved, so the
// callback (and therefore the first page) is held back until all SMIL files are parsed:
$.when.apply($, parsingDeferreds).done(function () {
    packageDocument.getMetadata().setMoMap(mo_map);
    if (allFakeSmil) {
        // console.log("No Media Overlays");
        packageDocument.getMetadata().setMoMap([]);
    }
    callback();
});

};
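
For what it's worth, the kind of deferral asked about above could look roughly like the sketch below: resolve the callback as soon as placeholder (fake) SMIL entries exist for every spine item, then parse the real SMIL files one by one in the background and patch the media-overlay map as each one completes. This is only an illustration built on the snippet above, not how readium-js is currently structured; fillSmilDataDeferred is a hypothetical name, and the Media Overlays engine would have to tolerate map entries that are filled in after the first page is shown.

// Hypothetical sketch (not part of readium-js): show the first page immediately,
// then fill in the real SMIL data one spine item at a time.
this.fillSmilDataDeferred = function (callback) {
    var that = this;
    var mo_map = [];
    var pending = []; // spine items whose real SMIL still needs to be parsed

    for (var spineIdx = 0; spineIdx < packageDocument.spineLength(); spineIdx++) {
        var spineItem = packageDocument.getSpineItem(spineIdx);

        // Start every spine item with a placeholder so the reader can open right away:
        mo_map.push(makeFakeSmilJson(spineItem));

        if (spineItem.media_overlay_id) {
            pending.push({ index: spineIdx, spineItem: spineItem });
        }
    }

    packageDocument.getMetadata().setMoMap(mo_map); // assumes the engine keeps a live reference to this array
    callback(); // the first page can render now

    // Parse the remaining SMIL files sequentially, in spine order:
    function parseNext() {
        if (pending.length === 0) return;

        var next = pending.shift();
        var manifestItemSMIL = packageDocument.manifest.getManifestItemByIdref(next.spineItem.media_overlay_id);
        if (!manifestItemSMIL) {
            parseNext();
            return;
        }

        var smilJson = {};
        var parsingDeferred = $.Deferred();

        that.parse(next.spineItem, manifestItemSMIL, smilJson, parsingDeferred,
            function (myDeferred, parsedSmilJson) {
                mo_map[next.index] = parsedSmilJson; // replace the placeholder in place
                myDeferred.resolve();
            },
            function (myDeferred, parseError) {
                myDeferred.resolve(); // keep the placeholder for this item on error
            });

        parsingDeferred.done(parseNext);
    }

    parseNext();
};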

harishag commented 7 years ago

We are all facing this issue; any help in this area would be appreciated. Load times for large books with SMIL are very high.

danielweck commented 7 years ago

There are no plans to implement SMIL lazy-loading (in either ReadiumJS or ReadiumSDK/C++). However, this would certainly improve initial-load performance (possibly to the detriment of playback-time performance, though this could be mitigated with some look-ahead techniques).
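
As a rough illustration of the look-ahead idea (all names here are assumptions, not an existing readium-shared-js API): when playback reaches a given spine item, the SMIL for the next spine item could be fetched and parsed in the background so it is ready before playback gets there.

// Hypothetical look-ahead sketch; loadSmilForSpineItem is an assumed lazy-loading
// primitive that returns a promise for the parsed SMIL of one spine item.
var smilCache = {}; // spine index -> parsed SMIL JSON or a pending promise

function prefetchNextSmil(currentSpineIdx) {
    var nextIdx = currentSpineIdx + 1;
    if (nextIdx >= packageDocument.spineLength()) return;

    var nextItem = packageDocument.getSpineItem(nextIdx);
    if (!nextItem.media_overlay_id || smilCache[nextIdx]) return;

    // Start parsing ahead of time; playback of the next item would await this promise.
    smilCache[nextIdx] = loadSmilForSpineItem(nextItem);
}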

Things to consider:

There might be functions in the Media Overlays engine (in readium-shared-js) that depend on having access to the entire SMIL timing tree across multiple spine items (I'm not sure, off the top of my head). I think there are also calculations (for total durations, IIRC) performed during full-publication parsing. Again, this is to be verified; I'm not sure anymore.
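
For context, the kind of whole-publication calculation referred to above would look something like the sketch below (illustrative only, not the actual readium-shared-js code; the per-item duration property is an assumption). The point is that a total duration cannot be known until every SMIL file has been parsed, which is exactly what lazy-loading postpones.

// Sketch of a calculation that assumes the full media-overlay map is available up front
// (illustrative only; 'duration' is an assumed per-SMIL total, in seconds):
function computeTotalMediaOverlayDuration(mo_map) {
    var total = 0;
    for (var i = 0; i < mo_map.length; i++) {
        total += mo_map[i].duration || 0;
    }
    return total;
}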

Although the loaded EPUB publications are passed as a monolithic JSON dataset to readium-shared-js, the exception to this rule of thumb is the table of contents (EPUB2 NCX / EPUB3 Navigation Document), which is loaded separately by the viewer instance. In other words, there is a precedent for fetching resources from the EPUB container after the main JSON is generated. This model could be followed for the SMIL files too.
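
Following that precedent, on-demand SMIL loading might look roughly like the sketch below. fetchFileFromEpubContainer and parseSmilXml stand in for "fetch a resource from the EPUB container" and "parse a SMIL document"; they are assumed helpers, not existing readium-js functions.

// Hypothetical sketch of fetching and parsing a SMIL file only when its spine item
// is needed, mirroring how the table of contents is loaded separately:
var parsedSmilCache = {}; // media_overlay_id -> parsed SMIL JSON

function ensureSmilLoaded(spineItem, onReady) {
    if (!spineItem.media_overlay_id) {
        onReady(undefined); // no media overlay for this spine item
        return;
    }
    if (parsedSmilCache[spineItem.media_overlay_id]) {
        onReady(parsedSmilCache[spineItem.media_overlay_id]);
        return;
    }

    var manifestItemSMIL = packageDocument.manifest.getManifestItemByIdref(spineItem.media_overlay_id);
    if (!manifestItemSMIL) {
        onReady(undefined);
        return;
    }

    fetchFileFromEpubContainer(manifestItemSMIL.href, function (smilXml) {
        var smilJson = parseSmilXml(smilXml);
        parsedSmilCache[spineItem.media_overlay_id] = smilJson;
        onReady(smilJson);
    }, function (error) {
        onReady(undefined); // fall back to "no media overlay" for this item on error
    });
}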