readium / readium-shared-js

Repository for the shared JavaScript libraries that are used in the SDK-Launchers and other applications developed on top of the SDK
BSD 3-Clause "New" or "Revised" License
78 stars 101 forks source link

absolute resource URLs are handled as relative (EPUB3 external media) #226

Closed danielweck closed 9 years ago

danielweck commented 9 years ago

See resolveRelativeUrl() https://github.com/readium/readium-shared-js/blob/develop/js/models/package.js#L79

danielweck commented 9 years ago

Fixed here: https://github.com/readium/readium-shared-js/commit/789169c14021296e913c1a6d6a79f42d4440dfc4 (URL is considered "absolute" when a scheme / protocol is present in its syntax)

pmstss commented 9 years ago

(URL is considered "absolute" when a scheme / protocol is present in its syntax)

check with "is" query (http://medialize.github.io/URI.js/docs.html#constructor) i.e. is('relative') will do job more straightforward - or there are some caveats here? also new keyword can be omitted and then check syntax will be bit more elegant URI(relativeUrl).is('relative')

danielweck commented 9 years ago

Thanks for chiming in @pmstss I totally agree, in fact this is why I commented explicitly in the description (for future reference, as we might need to actually check / use the .scheme()).

Additional remarks: I keep coming across path normalization issues, URL handling discrepancies, etc. in the Readium codebase. For example sometimes path concatenation is done "manually", sometimes using the URI.toAbsolute() API. I went through a similar experience in the Readium SDK, where I fixed a number of issues related to Unicode handling as well as GURL usage (this is a performance-sensitive area, as paths manipulations are performed often, at some key stages of EPUB parsing / processing). Because of the levels of indirection in an EPUB container (manifest item paths are first rooted to the OPF package, then based relative to SMIL files, XHTML files, etc.), it is absolutely essential that paths are normalized before comparison / matching (URL escape codes, "." and ".." sequences, redundant "/", etc.). Most EPUBs are created from predictable templates, so most of the time things just appear to work fine. Then, one day we come across an EPUB that doesn't conform to the usual pattern, and things break...and we fix it in a ad-hoc manner (recently, manifest items starting with "./" !). I am actually in the middle of conducting a broader review of path usage, notably in readium-js where there are confusing statements package vs. container (OPF-rooted files, vs. paths based on ZIP filesystem root), which makes the code hard to understand (ZIP iframe loade + publication / resource fetchers). Sorry for the long description, but I thought you'd be interested. Comments welcome!

bluefirepatrick commented 9 years ago

@danielweck, this is something that we have noticed as well: a book created with some generic template that only used relative paths to differentiate resources could easily break things. It's on our list of things to fix but it not front-burner for us right now. We do have some ideas around it though so if you are wanting to tackle this we would be happy to share them.

pmstss commented 9 years ago

Thank you @danielweck for details about path issues, i have just added path-related pull request to readium-js - additional confirmation that path handling sometimes is tricky.

Initial objection was that code sometimes is unnecessary wordy, that makes it harder to read. Btw, is some code style checks like node-jscs are already implemented/planned?

danielweck commented 9 years ago

@pmstss thanks for the heads-up about JSCS. There's support for JSLINT in the current build system (could be used as a pre-compile step), but it's not wired-in at the moment (too many warnings / errors!) https://github.com/readium/readium-js-viewer/blob/master/package/package_scripts_build.cson#L35 https://github.com/readium/readium-cfi-js/blob/master/readium-build-tools/jshint_glob.js

danielweck commented 9 years ago

@pmstss URI.is("absolute") API now used: https://github.com/readium/readium-shared-js/commit/7d8bea0f9a8c453bba77214e1a8184b8a87be4ac (I'm also making this consistent in the other repositories)