readium / architecture

📚 Documents the architecture of the Readium projects
https://readium.org/architecture/
BSD 3-Clause "New" or "Revised" License

Specific repo for test samples (Casting your vote) #84

Open JayPanoz opened 5 years ago

JayPanoz commented 5 years ago

As discussed in the Readium Engineering call yesterday (16 January), we have quite a significant amount of test samples in the form of EPUB files:

It’s important to note their scope and goals are different from the EPUBTest suite, and some more could eventually be created for some things like metadata parsing, etc.

My biggest concern is that those files, if considered useful, are hidden in multiple repos right now, while they are not strictly speaking tightly-coupled to those repos; you could indeed build/use another module and those files could still be used for manual testing in your reading app.

In terms of discoverability and ease of use, a specific repo would make sense.

Pros:

(In addition, maybe the src – i.e. exploded EPUB – files could also be used for unit testing but I’m not sure about that as it depends on a lot of factors so just mentioning that instead of listing it.)

Cons:

There may also be ramifications @ the ecosystem level, esp. as testing is being discussed for EPUB 3.2.


We’ve agreed to let people think about such a project for the next 2 weeks. So please feel free to use this issue to provide feedback and/or cast your vote.

To cast your vote:

You can either pick 👍 or 👎 in Reactions (to this message), which is the smiley icon:

(Screenshot of the Reactions smiley icon, 2019-01-17)

Or vote in a message (“LGTM”, a detailed opinion, or something similar).

Voting closes on January 31.

Thanks in advance. :-)

danielweck commented 5 years ago

I am aware that some of us internally/privately make use of cloud hosting to share access to test EPUBs, but as you can imagine this work practice severely lacks structure and documentation (it's really just a bunch of huge EPUB listings sorted alphabetically, by size or date).

So +1 to a new central GitHub repository, with some methodology attached to it :)

danielweck commented 5 years ago

For my own tests I use the IDPF samples on a regular basis (I don't even bother downloading all EPUBs, as the Readium2 TypeScript software can load remote publications, packed or unpacked):
https://github.com/IDPF/epub3-samples/tree/master/30
https://idpf.github.io/epub3-samples/30/samples.html

There are the DAISY a11y tests as well: https://github.com/daisy/epub-accessibility-tests/tree/master/content

...and of course the EPUB testsuite: https://github.com/IDPF/epub-testsuite/tree/master/content/30

Some time ago for readium-js-viewer (Readium1 cloud reader) we used a script to automate the creation of OPDS1 feeds, by crawling the above URLs (automatically extracting covers, basic metadata, etc.):
https://github.com/readium/readium-js-viewer/blob/master/epub_content/idpf_samples.opds
https://github.com/readium/readium-js-viewer/blob/master/epub_content/epub_testsuite.opds
https://github.com/readium/readium-js-viewer/blob/master/epub_content/epub_tests_a11y.opds
The script: https://github.com/readium/readium-shared-js/blob/master/readium-build-tools/genOPDS.js

There's a similar Readium2 TypeScript routine, to generate the OPDS2 feed for a folder of local publications served by the streamer (it's not a fully-baked feature, just a handy micro-service to ease the pain of testing many files, many repeated times): https://github.com/readium/r2-streamer-js/blob/develop/src/http/opds2-create-cli.ts

So as you can guess, my feature request is: create/refresh static OPDS 1 and 2 feeds every time our new proposed "publication samples" repository gets updated (Git commit hook / TravisCI build trigger?), or alternatively generate the feeds dynamically (I prefer the former option).
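To make the "static feed" option a bit more concrete, here is a minimal sketch of what a regeneration step could produce. It is not the r2-streamer-js routine linked above: the `buildFeed` function, the type names, and the `example.org` base URL are all hypothetical, the title is naively derived from the file name instead of the package metadata, and the output is only a simplified subset of the OPDS 2 JSON shape (no cover images, for instance).

```typescript
// Hypothetical, simplified subset of the OPDS 2 JSON structure.
interface Link {
  href: string;
  type: string;
  rel?: string;
}

interface Publication {
  metadata: { title: string };
  links: Link[];
}

interface Feed {
  metadata: { title: string };
  links: Link[];
  publications: Publication[];
}

// Placeholder title: file name without folder, extension, dashes or underscores.
// A real generator would read the title from the publication's own metadata.
function titleFromPath(path: string): string {
  const base = path.substring(path.lastIndexOf("/") + 1);
  return base.replace(/\.epub3?$/i, "").replace(/[-_]+/g, " ").trim();
}

// Build a feed object from a list of EPUB paths relative to baseUrl.
function buildFeed(title: string, baseUrl: string, epubPaths: string[]): Feed {
  const root = baseUrl.replace(/\/+$/, "");
  const publications = [...epubPaths].sort().map((path): Publication => ({
    metadata: { title: titleFromPath(path) },
    links: [
      {
        rel: "http://opds-spec.org/acquisition/open-access",
        href: root + "/" + path.split("/").map(encodeURIComponent).join("/"),
        type: "application/epub+zip",
      },
    ],
  }));
  return {
    metadata: { title },
    links: [{ rel: "self", href: root + "/feed.json", type: "application/opds+json" }],
    publications,
  };
}

// Usage: serialize the result and commit it next to the samples.
const sampleFeed = buildFeed("Publication samples", "https://example.org/samples/", [
  "layout/fixed-layout_test.epub",
  "a11y/basic.epub",
]);
const sampleFeedJson = JSON.stringify(sampleFeed, null, 2);
```

A commit hook or CI job would only need to list the EPUB files in the repository, call something like `buildFeed`, and write the JSON to disk, which is why the static option stays cheap to host.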

Voila ;)

danielweck commented 5 years ago

For stress-testing the database backend of a reading system, or its bookshelf / OPDS user interface, we can fetch public-domain resources from places like Standard Ebooks (although in this particular case, the caveat is that their EPUB structure is very consistent by design, hence not useful for hitting corner-cases):

Their GitHub (one repo per publication): https://github.com/standardebooks

Their website / browsing interface: https://standardebooks.org/ebooks/

Their OPDS feed: https://standardebooks.org/opds/all

(PS / PSA: they use the .epub3 extension which fails in reading systems that filter .epub filenames, naturally)
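For what it's worth, a reading system that filters by file name could accept both extensions with something like the sketch below. The function name and the extension list are assumptions for illustration, not code taken from any Readium module.

```typescript
// Hypothetical list of file extensions treated as EPUB publications.
const ACCEPTED_EXTENSIONS = [".epub", ".epub3"];

// Case-insensitive check on the end of the file name only;
// it does not inspect the file contents or its media type.
function looksLikeEpub(fileName: string): boolean {
  const lower = fileName.toLowerCase();
  return ACCEPTED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```

Checking the media type or the contents of the file would be more robust than trusting the extension, but this keeps a simple file picker from rejecting `.epub3` downloads.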

Over time I've aggregated a significant number of publicly-accessible OPDS feeds, which I find useful for testing too: https://github.com/readium/r2-streamer-js/blob/develop/docs/opds.md#a-selection-of-public-opds-feeds (bearing in mind, the acquisition links may be "indirect", and not public-domain)

If you know other OPDS feeds worth considering, please PR :) (sorry for the plug)

JayPanoz commented 5 years ago

Ah thanks Daniel for your outstanding listing of resources.

I didn’t even think about having a list of other useful feeds/samples but that’s definitely something that could also help and would fit in such a dedicated repo i.e. “if you want more complicated samples to test your app, check [x, y, z, …].”

JayPanoz commented 5 years ago

As a (natural) extension of our r2-glue-js call, there’s been very interesting feedback and ideas about this repo.

For instance:

Once again this is different from EPUBTest in terms of scope and goals.

rkwright commented 5 years ago

Note that there are more test files of various provenance here:

https://github.com/readium/readium-test-files

but they need curation as well.

JayPanoz commented 5 years ago

Note that if this repo happens, I’d probably volunteer to help set it up (samples + why not templating, as I already did some research about that in the past).

My bandwidth being very low right now, that wouldn’t be before April though, and I’d prefer to dedicate this limited time to some low-hanging fruit in ReadiumCSS first.

gautierchomel commented 4 months ago

+1 to a new central GitHub repository, with some methodology attached to it :)