readium / readium-js-viewer

👁 ReadiumJS viewer: default web app for Readium.js library
BSD 3-Clause "New" or "Revised" License

Basic OPDS support in library view (epub_library.json ATOM/XML adapter) #412

Closed danielweck closed 9 years ago

danielweck commented 9 years ago

Example: http://development.readium.divshot.io?epubs=URL, where URL is http://www.feedbooks.com/books/free.atom or opds://www.feedbooks.com/books/free.atom (note the OPDS URI protocol)

Also public domain: http://www.feedbooks.com/books/top.atom?range=month (see http://www.feedbooks.com/catalog.atom )
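
A minimal Python sketch of the two URL manipulations involved here. It assumes that opds:// is handled as a plain scheme swap to http://, and that the feed URL is percent-encoded in full before being placed in the epubs parameter. The function names are hypothetical, not part of Readium.

```python
from urllib.parse import quote

# Viewer deployment quoted in the comment above.
VIEWER = "http://development.readium.divshot.io"


def normalize_feed_url(url: str) -> str:
    """Rewrite opds:// to http:// (assumed to be a plain scheme swap)."""
    prefix = "opds://"
    if url.lower().startswith(prefix):
        return "http://" + url[len(prefix):]
    return url


def viewer_link(feed_url: str, viewer: str = VIEWER) -> str:
    """Percent-encode the feed URL so its own '?', '&' and '=' survive."""
    return f"{viewer}?epubs={quote(feed_url, safe='')}"


print(normalize_feed_url("opds://www.feedbooks.com/books/free.atom"))
print(viewer_link("http://www.feedbooks.com/books/top.atom?range=month"))
```

Encoding matters most for feeds that carry their own query string, such as top.atom?range=month: left unencoded, a "&" in the feed URL would be read as a separator in the viewer's query string.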

danielweck commented 9 years ago

Addressed here: https://github.com/readium/readium-js-viewer/commit/3d78d6f6a3110a11b03d1792607705bf6d469c6e
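
For readers unfamiliar with OPDS, here is a rough Python sketch of what an ATOM/XML-to-library adapter has to do. It is not the code from the commit above. The output keys (title, author, coverHref, rootUrl) are hypothetical; only the Atom namespace and the OPDS link relations come from the specifications.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
ACQUISITION = "http://opds-spec.org/acquisition"  # also prefixes /open-access etc.
IMAGE = "http://opds-spec.org/image"              # also prefixes /thumbnail
EPUB = "application/epub+zip"


def feed_to_library(atom_xml: str) -> list:
    """Return one dict per entry that offers an EPUB acquisition link."""
    books = []
    for entry in ET.fromstring(atom_xml).iter(ATOM + "entry"):
        epub = cover = None
        for link in entry.findall(ATOM + "link"):
            rel, href = link.get("rel", ""), link.get("href")
            if rel.startswith(ACQUISITION) and link.get("type") == EPUB:
                epub = epub or href      # keep the first EPUB link
            elif rel.startswith(IMAGE):
                cover = cover or href    # keep the first image link
        if epub:  # skips navigation entries and entries without an EPUB
            books.append({
                # Hypothetical key names, chosen for illustration only.
                "title": entry.findtext(ATOM + "title", default=""),
                "author": entry.findtext(f"{ATOM}author/{ATOM}name", default=""),
                "coverHref": cover,
                "rootUrl": epub,
            })
    return books
```

Entries without an EPUB acquisition link (navigation entries, or PDF-only publications) are dropped in this sketch. A real adapter would more likely follow navigation links than discard them.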

danielweck commented 9 years ago

Tested with:

http://www.feedbooks.com/store/selection.atom ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/store/selection.atom )

http://www.feedbooks.com/books/top.atom?range=month ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/books/top.atom?range%3Dmonth )

http://www.feedbooks.com/books/free.atom ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/books/free.atom )

http://www.feedbooks.com/featured.atom ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/featured.atom )

http://www.feedbooks.com/store/top.atom ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/store/top.atom )

http://www.feedbooks.com/store/recent.atom ( http://readium.surge.sh/?epubs=http://www.feedbooks.com/store/recent.atom )

OPDS catalog navigation: http://readium.surge.sh/?epubs=http://www.feedbooks.com/catalog.atom

danielweck commented 9 years ago

EPUB Testsuite: http://s3.amazonaws.com/epub3.nypl-labs.biz/opds/index.xml (no HTTP CORS headers)

OPDS test catalog: http://feedbooks.github.io/opds-test-catalog/ Root (no acquisition): opds://feedbooks.github.io/opds-test-catalog/catalog/root.xml Example acquisition set: http://feedbooks.github.io/opds-test-catalog/catalog/acquisition/blocks.xml

OPDS repository: http://wiki.mobileread.com/wiki/OPDS

Note that the Internet Archive feeds fail to load because they lack HTTP CORS headers, e.g. http://bookserver.archive.org/catalog/downloads.xml http://bookserver.archive.org/catalog/

O'Reilly: http://opds.oreilly.com/opds/

Same goes for Gutenberg: http://gutenberg.org/ebooks/50050.opds http://m.gutenberg.org/ebooks.opds/ http://m.gutenberg.org/ebooks/?format=opds (which also rejects CORS proxies such as http://crossorigin.me/http://gutenberg.org/ebooks/50050.opds )

...and PragProg: https://pragprog.com/magazines.opds

...or BaeneBooks: http://www.baenebooks.com/stanza.aspx?feed=free http://www.webscription.net/stanza.aspx

...as well as Revues.org http://bookserver.revues.org/?sort=OA

...or Atoll Digital Library: http://atoll-digital-library.org/opds/feed.php?page=5&id=1&db=3

...and for SmashWords: http://www.smashwords.com/lexcycle/books/1/newest/epub/any/0

...or ManyBooks: http://manybooks.net/opds/title_detail.php?tid=marquezrother15astrayincouper

...and EpubBud: http://www.epubbud.com/feeds/recent.atom

...as well as Ebooks-Gratuits: http://www.ebooksgratuits.com/opds/feed.php

...or Atramenta: http://www.atramenta.net/opds/latest.atom?type=public_domain

...or Tatsu-Zine: http://tatsu-zine.com/catalogs/all.opds

...and BeamBooks: http://stanza.beam-ebooks.de/stanza/xmlcatalog.php5?art=neue

...EpubBooks.ru http://www.epubbooks.ru/lastadd.xml (CORS proxy rejected)

...and Tuebl.ca: http://tuebl.ca/catalog/newest

...as well as Flibusta: http://flibusta.net/opds/new/0/new

...or CoolLib: http://coollib.com/opds/new/0/new

http://chitanka.info/catalog.opds

Litres RU: http://opds.litres.ru/ http://data.fbreader.org/catalogs/litres/index.php

Hungarian: http://bookserver.mek.oszk.hu

http://flaschenpost.piratenpartei.de/catalog/

Bad acquisition links (authentication?): http://opds.youscribe.com/Catalog/catalog.xml

CORS proxy forbidden: http://ebooks.qumran.org/opds/?lang=en http://blah.me/opds/index.atom http://books.blah.me/index.atom

http://uread.superfection.com/log/stanza.cgi

http://lib.rus.ec/opds

http://www.booksonboard.com/xml/catalog.atom

http://iknigi.net/opds

http://www.wolnelektury.pl/opds

http://www.zone4iphone.ru/catalog.php

https://www.gitbook.com/api/opds/catalog.atom

http://eforge.eu/OPDS/_catalog/index.xml

This feed has PDF, no EPUB: http://lupa.biblhertz.it/feed/lupa.atom
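
Most of the failures listed in this comment come down to a single response header. As a rough illustration in Python, here is the check a browser applies to a simple, credential-free GET before letting page script read the response. The function name and the dict-based header input are assumptions made for this sketch.

```python
def allows_cross_origin(response_headers: dict, origin: str) -> bool:
    """True if a page at `origin` may read a simple (credential-free) GET response."""
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {k.lower(): v.strip() for k, v in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    # The header must be the wildcard or match the requesting origin exactly.
    return allowed in ("*", origin)


print(allows_cross_origin({"Content-Type": "application/atom+xml"},
                          "http://readium.surge.sh"))
```

A feed that sends no Access-Control-Allow-Origin header at all, as reported for the feeds above, fails this check regardless of which origin the viewer is served from.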

danielweck commented 9 years ago

Note that CORS proxying via a custom HTTP server can work around the above limitations. For example:

http://crossorigin.me https://github.com/technoboy10/crossorigin.me

https://cors-anywhere.herokuapp.com https://github.com/Rob--W/cors-anywhere

http://www.whateverorigin.org https://github.com/ripper234/Whatever-Origin

https://github.com/limtaesu/alloworigin

danielweck commented 9 years ago

Without CORS proxy (doesn't work): http://readium.surge.sh?epubs=http%3A%2F%2Fpragprog.com%2Fmagazines.opds

With CORS proxy (works): http://readium.surge.sh?epubs=http%3A%2F%2Fcrossorigin.me%2Fhttps%3A%2F%2Fpragprog.com%2Fmagazines.opds

Readium is agnostic about which CORS proxy is used (in this example: http://crossorigin.me), but picks up the URL based on the template {PROXY}/http:// or {PROXY}/https://.

See: https://github.com/readium/readium-js-viewer/commit/9638bc5f0b1eca228c1f132072d05a6aaf23b1b7
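
The {PROXY}/http:// and {PROXY}/https:// template can be illustrated with a short Python sketch. This is one interpretation of the template, not the code from the commit above; the function names and the regular expression are assumptions.

```python
import re
from typing import Optional, Tuple
from urllib.parse import quote

# {PROXY}/http:// or {PROXY}/https:// : a second scheme right after the proxy origin.
_PROXIED = re.compile(r"^(https?://[^/]+)/(https?://.+)$")


def via_proxy(feed_url: str, proxy: str) -> str:
    """Prefix the feed URL with the proxy origin."""
    return proxy.rstrip("/") + "/" + feed_url


def split_proxy(url: str) -> Tuple[Optional[str], str]:
    """Return (proxy, target); proxy is None when the URL is not proxied."""
    m = _PROXIED.match(url)
    return (m.group(1), m.group(2)) if m else (None, url)


proxied = via_proxy("https://pragprog.com/magazines.opds", "http://crossorigin.me")
print(split_proxy(proxied))
print("http://readium.surge.sh?epubs=" + quote(proxied, safe=""))
```

The last line reproduces the "With CORS proxy" link above. Because the split depends only on the embedded second scheme, any proxy that follows the same URL convention should work.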

danielweck commented 9 years ago

Aggregated OPDS feeds, for testing:

http://readium.surge.sh/?epubs=https%3A%2F%2Fdl.dropboxusercontent.com%2Fu%2F585153%2FReadium%2Febooks%2Fopds_links.atom

danielweck commented 9 years ago

Potential feature extension: OPDS search via URI templates (OpenSearch http://a9.com/-/spec/opensearch/1.1/ http://www.opensearch.org )

e.g. http://bookserver.archive.org/catalog/ =>

<link
rel="search"
type="application/opensearchdescription+xml"
href="http://bookserver.archive.org/catalog/opensearch.xml"
/>

http://bookserver.archive.org/catalog/opensearch.xml =>

<Url
type="application/atom+xml"
template="http://bookserver.archive.org/catalog/opensearch?q={searchTerms}&amp;start={startPage?}"
/>

URI template: http://bookserver.archive.org/catalog/opensearch?q={searchTerms}&start={startPage?}

Example of an instantiated URI template: http://bookserver.archive.org/catalog/opensearch?q=arabic&start=2 => http://readium.surge.sh/?epubs=http%3A%2F%2Fcrossorigin.me%2Fhttp%3A%2F%2Fbookserver.archive.org%2Fcatalog%2Fopensearch%3Fq%3Darabic%26start%3D2 (includes the http://crossorigin.me proxy to work around the lack of CORS headers at http://bookserver.archive.org)
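
Instantiating such a template could look like the following Python sketch. It follows the OpenSearch 1.1 convention that a trailing "?" marks a parameter as optional, and that an omitted optional parameter becomes an empty string. The function name fill_template is hypothetical.

```python
import re
from urllib.parse import quote


def fill_template(template: str, **params: str) -> str:
    """Substitute {name} and {name?} placeholders in an OpenSearch URL template."""
    def repl(m: "re.Match") -> str:
        name, optional = m.group(1), m.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # omitted optional parameters become empty strings
        raise KeyError(f"missing required parameter: {name}")
    return re.sub(r"\{([A-Za-z:]+)(\??)\}", repl, template)


# The template as an XML parser would return it (&amp; already decoded to &).
TEMPLATE = ("http://bookserver.archive.org/catalog/opensearch"
            "?q={searchTerms}&start={startPage?}")
print(fill_template(TEMPLATE, searchTerms="arabic", startPage="2"))
```

The resulting URL would still need to be prefixed with a CORS proxy and percent-encoded before being passed to the viewer's epubs parameter.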

Gutenberg: http://m.gutenberg.org/ebooks.opds/

<link rel="search" type="application/opensearchdescription+xml" title="Project Gutenberg Catalog Search" href="//m.gutenberg.org/catalog/osd-books.xml"/>
<opensearch:itemsPerPage>25</opensearch:itemsPerPage>
<opensearch:startIndex>1</opensearch:startIndex>

danielweck commented 8 years ago

Note about the HTTPS requirement for cross-origin requests (e.g. the Feedbooks OPDS catalog): https://readium.surge.sh and https://readium.firebaseapp.com both force HTTPS, which is becoming standard industry practice.

http://www.feedbooks.com => redirects to insecure HTTP! https://www.feedbooks.com/catalog.atom ...so we can in fact use a CORS proxy such as https://crossorigin.me: https://readium.surge.sh/?epubs=https%3A%2F%2Fcrossorigin.me%2Fhttps%3A%2F%2Fwww.feedbooks.com%2Fcatalog.atom& ...which is fine for viewing library thumbnail covers, but unfortunately the actual EPUBs fail to load, because Feedbooks does not work with proxied requests (bad HTTP headers) and because Feedbooks redirects the URLs of OPDS acquisition payloads (“preview” links).

See this sample DropBox-hosted OPDS feed aggregator (most linked feeds rely on an HTTP CORS proxy): https://readium.surge.sh/?epubs=https%3A%2F%2Fdl.dropboxusercontent.com%2Fu%2F585153%2FReadium%2Febooks%2Fopds_links.atom => I expect that some of them will fail because of insecure HTTP.
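
To predict which links are affected, a small Python sketch can flag viewer links where an HTTPS page is asked to fetch a plain-HTTP URL, which browsers block as mixed content. The function name is hypothetical, and the check looks only at the outermost scheme, i.e. the request the browser itself makes.

```python
from urllib.parse import parse_qs, urlsplit


def blocked_as_mixed_content(viewer_link: str) -> bool:
    """True when an HTTPS viewer page is asked to fetch a plain-HTTP feed."""
    parts = urlsplit(viewer_link)
    # parse_qs percent-decodes the value of the epubs parameter.
    feed = parse_qs(parts.query).get("epubs", [""])[0]
    return parts.scheme == "https" and urlsplit(feed).scheme == "http"


print(blocked_as_mixed_content(
    "https://readium.surge.sh/?epubs=http%3A%2F%2Fwww.feedbooks.com%2Fcatalog.atom"))
```

A feed reached through an HTTPS proxy passes this check even when the proxied target is plain HTTP, since the browser only ever talks to the proxy.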

danielweck commented 8 years ago

OPDS validation links:

Main feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_library.opds

EDUPUB feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_edu.opds

EPUB3 Samples feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_samples.opds

DAISY a11y feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_tests_a11y.opds

EPUB Testsuite feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_testsuite.opds

EPUB Widgets (ESC) feed: https://opds-validator.appspot.com/?uri=https://raw.githubusercontent.com/readium/readium-js-viewer/develop/epub_content/epub_widgets.opds

ReadBeyond (EPUB3 Media Overlays) feed: https://opds-validator.appspot.com/?uri=https://dl.dropboxusercontent.com/u/585153/Readium/ebooks/readbeyond.opds

Aggregated public domain feed: https://opds-validator.appspot.com/?uri=https://dl.dropboxusercontent.com/u/585153/Readium/ebooks/opds_links.atom

danielweck commented 7 years ago

Note: https://crossorigin.me seems to have been failing for some time now, due to missing request headers. https://cors-anywhere.herokuapp.com seems to be working fine. e.g.: https://readium.firebaseapp.com/?epubs=https%3A%2F%2Fcors-anywhere.herokuapp.com%2Fhttps%3A%2F%2Finstantclassics-beta.librarysimplified.org%2Findex.xml (from http://openebooks.net/catalog.html )

danielweck commented 7 years ago

Handy URLs (for testing OPDS support via CORS proxy):

https://readium.firebaseapp.com/?epubs=https%3A%2F%2Fcors-anywhere.herokuapp.com%2Fhttps%3A%2F%2Fopds.surge.sh%2Fopds_links.atom

https://readium.firebaseapp.com/?epubs=https%3A%2F%2Fcors-anywhere.herokuapp.com%2Fhttps%3A%2F%2Fopds.surge.sh%2Freadbeyond.opds