alan-turing-institute / scivision

scivision: a framework for scientific image analysis
https://sci.vision/
BSD 3-Clause "New" or "Revised" License

[ENH] Datasets with labels (and in CSV format) #251

Open miquelmassot opened 2 years ago

miquelmassot commented 2 years ago

Feature Request

I'd like to be able to load CSV files containing either image URLs or local image file paths. I've tried setting up a dataset plugin repo with a CSV file, but I don't know how to make it work with any scivision model. What are the requirements, in terms of data types and structure, for common models?

The repository I started is available at https://github.com/miquelmassot/squidle_scivision

Describe the solution you'd like

I was thinking of a scivision plugin capable of distinguishing between the intake catalogs that already exist and CSV files like the one I've described. I suppose users would need to tell the "loader" which column holds the image and which columns hold the metadata they are interested in. Users could also provide a ground-truth class column for training or validation.
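To make the request more concrete, here is a minimal sketch of the kind of "loader" described above, using only the Python standard library. It is not an existing scivision or intake API: the function name `load_labelled_csv`, its parameters, and the example column names and paths are all hypothetical.

```python
import csv
import io
from urllib.parse import urlparse


def load_labelled_csv(handle, image_column, metadata_columns=(), label_column=None):
    """Read an open CSV file and return one record per image row.

    Hypothetical helper for illustration only; not part of scivision.
    """
    records = []
    for row in csv.DictReader(handle):
        location = row[image_column]
        # Treat an http(s) scheme as a URL; anything else as a local file path.
        is_url = urlparse(location).scheme in ("http", "https")
        records.append({
            "image": location,
            "is_url": is_url,
            "metadata": {name: row[name] for name in metadata_columns},
            "label": row[label_column] if label_column else None,
        })
    return records


# Hypothetical CSV content, standing in for a real file on disk.
EXAMPLE_CSV = """image,depth,species
https://example.org/images/0001.jpg,12.5,squid
data/images/0002.jpg,40.0,crab
"""

records = load_labelled_csv(
    io.StringIO(EXAMPLE_CSV),
    image_column="image",
    metadata_columns=["depth"],
    label_column="species",
)
for record in records:
    print(record)
```

The user names the image column, the metadata columns, and optionally a label column, so the same function covers both URL-based and local-path CSV files. Actually fetching or opening the images is left out of this sketch.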

miquelmassot commented 2 years ago

[UPDATE] I've been able to load images into a dask dataframe by modifying the intake CSV driver; the code is available at miquelmassot/squidle-scivision. What I don't know now is how to make it usable for scivision...

edwardchalstrey1 commented 2 years ago

Just adding some notes to this issue as it came up again recently:

ots22 commented 1 year ago

scivision can’t make any use of the labels you have in the filenames of the dataset once you get to xarray. If we can do this already, make very clear how to in docs

Tangential to the main point of this issue, but I just wanted to note that the above is already possible - it is an intake-xarray feature, rather than a scivision feature.

This line of the intake catalog should result in an xarray coordinate named 'filename'.

I agree it would be good to have a more prominent example since it can be quite useful.
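As a rough illustration of the general idea (recovering named fields such as a label from filenames that follow a pattern), here is a plain-Python sketch. It is not intake-xarray's implementation or API; the helper names, the pattern `{label}_{index}.png`, and the file paths are hypothetical.

```python
import re
from pathlib import PurePosixPath


def pattern_to_regex(pattern):
    """Turn a format-style pattern such as '{label}_{index}.png' into a regex."""
    # Splitting on a single capture group alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text between fields
        else:
            regex += f"(?P<{part}>.+?)"   # named field, matched non-greedily
    return re.compile(regex)


def fields_from_filenames(paths, pattern):
    """Return one dict of extracted fields per path (empty if no match)."""
    matcher = pattern_to_regex(pattern)
    fields = []
    for path in paths:
        match = matcher.fullmatch(PurePosixPath(path).name)
        fields.append(match.groupdict() if match else {})
    return fields


# Hypothetical file paths with the label encoded in the filename.
paths = ["images/squid_001.png", "images/crab_002.png"]
extracted = fields_from_filenames(paths, "{label}_{index}.png")
print(extracted)
```

Each extracted field could then be attached to the corresponding image, for example as a coordinate along the dimension the images are stacked on.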