ome / ngff

Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
https://ngff.openmicroscopy.org

List sources of sample data #140

Open joshmoore opened 2 years ago

joshmoore commented 2 years ago

@folterj was looking for sample datasets and came across three (unsuitable) locations:

The spec page itself should be the definitive starting point (for now) to find sample data.

folterj commented 2 years ago

Hi @joshmoore, I didn't realise it's possible to open a Zarr by URL; after updating packages (in particular the obscure 'aiohttp') this is working fine. It would be nice to have some smaller samples as downloadable archives, but I understand that the whole point of distributed storage with Zarr is online access.
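For anyone else landing here, a minimal sketch of what opening by URL can look like. It assumes a v0.4-style layout where the `multiscales` metadata lives in the group attributes, and that `zarr`, `fsspec` and `aiohttp` are installed. The function names and the URL in the usage note are placeholders for illustration, not part of any library or a real sample location.

```python
from typing import Any, Mapping


def level_paths(attrs: Mapping[str, Any]) -> list[str]:
    """Return the dataset paths of the first multiscale image, in listed order."""
    multiscales = attrs.get("multiscales", [])
    if not multiscales:
        raise ValueError("no 'multiscales' metadata found; is this an OME-Zarr image group?")
    # Assumption: v0.4-style metadata, where each dataset entry carries a "path".
    return [dataset["path"] for dataset in multiscales[0]["datasets"]]


def open_remote_image(url: str):
    """Open a remote OME-Zarr image group read-only and return its first resolution level."""
    # Imported lazily so level_paths() can be used without these packages.
    # HTTP(S) access goes through fsspec, which in turn needs aiohttp.
    import zarr

    group = zarr.open_group(url, mode="r")
    paths = level_paths(dict(group.attrs))
    # Chunks are only fetched when the returned array is sliced.
    return group[paths[0]]
```

Usage would be along the lines of `open_remote_image("https://example.org/path/to/image.zarr")`, with the placeholder replaced by the URL of an actual sample.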

joshmoore commented 2 years ago

@folterj: so you would want Zips for a single download, rather than needing to use aws or mc to download from S3? Btw, ome-zarr-py provides a download method, but dedicated tools are more scalable.
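As a rough illustration of the "Zip for a single download" idea, here is a sketch that packs an already-downloaded Zarr directory into one archive using only the Python standard library. The function name `zip_zarr` is made up for this example and is not part of ome-zarr-py; storing entries uncompressed is a choice based on the assumption that the Zarr chunks are already compressed.

```python
import zipfile
from pathlib import Path


def zip_zarr(zarr_dir: str, zip_path: str) -> int:
    """Pack a local Zarr directory into a single zip file; return the number of files added."""
    root = Path(zarr_dir)
    if not root.is_dir():
        raise NotADirectoryError(zarr_dir)
    count = 0
    # ZIP_STORED: no extra compression, on the assumption that chunks are already compressed.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        # rglob("*") also matches dotfiles such as .zattrs and .zarray.
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Keep the top-level directory name so the archive unpacks to e.g. image.zarr/
                arcname = (Path(root.name) / path.relative_to(root)).as_posix()
                zf.write(path, arcname=arcname)
                count += 1
    return count
```

The archive should be written outside the Zarr directory itself, for example `zip_zarr("image.zarr", "image.zarr.zip")`.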

folterj commented 2 years ago

@joshmoore yes exactly. By the way thank you also for making me aware of this page including up-to-date v0.4 samples - this is really nicely done! https://idr.github.io/ome-ngff-samples/

joshmoore commented 2 years ago

> @joshmoore yes exactly.

Understood. We'll look into putting some more (smaller) samples up on Zenodo, but for now you can find a handful under:

https://zenodo.org/search?page=1&size=20&q=ngff&access_right=open&type=dataset

jluethi commented 2 years ago

Following up on today's OME2022 call: happy to offer the small example OME-Zarr datasets that we use for testing purposes and have put on Zenodo, e.g. this one: https://zenodo.org/record/7144919 . It passes the v0.4 ome-ngff-validator. It already contains some tables with measurements and custom ROIs (which we will make v0.5 spec compliant once that spec definition has been finalized).

Also, we have this tiny dataset that's just 17 / 32 MB (2D vs. 2 planes in 3D), which we use in some of our automated testing: https://zenodo.org/record/7274533

jburel commented 2 years ago

Gathering links following the OME2022 call

imagesc-bot commented 1 year ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-ngff-test-dataset/74436/5

joshmoore commented 1 year ago

Discussing with @erindiel, @jburel, and @pwalczysko today regarding the upcoming publication, there's a general sense that along with ngff.openmicroscopy.org we can maintain a single "data resources" page that then points to:

A similar strategy is likely to be followed for a top-level "tool resources" landing page, which in turn links to the tools (as https://ome.github.io/ome-ngff-tools does) as well as to discovered repositories.