vanvalenlab / caliban-toolbox

Data engineering toolbox for DeepCell

Feature requests for data loader #87

Open ngreenwald opened 4 years ago

ngreenwald commented 4 years ago

Right now, if you specify two different channels, the data loader will load them serially into the same array, so that it looks like you have 6 FOVs of the same channel, when in fact you have 3 FOVs, each with two channels.
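To make the expected layout concrete, here is a minimal sketch of what I mean, not the actual data loader code. The `stack_fovs` helper, the `(fov, channel)` dictionary keys, and the FOV and channel names are all hypothetical and chosen only for illustration. Channels go on their own trailing axis instead of being appended along the FOV axis.

```python
import numpy as np


def stack_fovs(images, fovs, channels):
    """Stack 2D images into one array of shape (fov, y, x, channel).

    `images` is a hypothetical dict mapping (fov, channel) -> 2D array.
    """
    per_fov = [
        # Channels of a single FOV are stacked along the last axis.
        np.stack([images[(fov, chan)] for chan in channels], axis=-1)
        for fov in fovs
    ]
    # FOVs are stacked along the first axis.
    return np.stack(per_fov, axis=0)


# Illustrative data: 3 FOVs, each with 2 channels of 4x4 pixels.
fovs = ["fov1", "fov2", "fov3"]
channels = ["chan_a", "chan_b"]
images = {(f, c): np.zeros((4, 4)) for f in fovs for c in channels}

stacked = stack_fovs(images, fovs, channels)
print(stacked.shape)  # (3, 4, 4, 2) rather than (6, 4, 4)
```

With this layout, 3 FOVs with two channels each stay distinguishable from 6 FOVs of a single channel.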

I know you had talked about putting some SQL-like logic in place to specify combinations of markers. I'm not sure whether the plan is to add that to the data loader itself or as an internal post-processing step once the images have been loaded.

A couple of additional things that would be nice to have, but are not mission critical, to improve the data loader:

  1. The data loader doesn't read .tiff files, only .tif files.
  2. It would be great to check whether the specified folder is empty: right now it fails with a pd.concat error.
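
Both items could be handled in one place when the folder is scanned. The following is only a sketch under assumptions: `list_image_files` and `IMAGE_SUFFIXES` are hypothetical names, not part of the existing data loader, and raising `ValueError` is just one reasonable choice for the empty-folder case.

```python
import tempfile
from pathlib import Path

# Accept both spellings of the TIFF extension (compared case-insensitively).
IMAGE_SUFFIXES = {".tif", ".tiff"}


def list_image_files(folder):
    """Return sorted image paths in `folder`, failing early if none exist."""
    folder = Path(folder)
    files = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not files:
        # Fail with a clear message before any downstream concatenation runs.
        raise ValueError(f"No .tif or .tiff files found in {folder}")
    return files


# Illustrative usage with a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.tif", "b.tiff", "notes.txt"):
        (Path(tmp) / name).touch()
    found_names = [p.name for p in list_image_files(tmp)]
    print(found_names)  # ['a.tif', 'b.tiff']
```

Checking up front means the user sees a message about the folder rather than an error from pd.concat.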

ngreenwald commented 4 years ago

If we have a folder of 10 images, we want to be able to categorize them, take only a subset for a certain job, and so on. To enable this, the data loader will need to return the filenames of the images that it has loaded; currently it returns only the path to the folder itself.
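
As a rough illustration of why the filenames matter, here is a sketch that assumes the loader returned a list of filenames in the same order as the first axis of the image array. The `select_by_filename` helper and the `img_*.tif` names are hypothetical, not existing toolbox functions.

```python
import numpy as np


def select_by_filename(data, filenames, wanted):
    """Return the images whose filename is in `wanted`, plus their names.

    Assumes `filenames[i]` is the name of the image stored in `data[i]`.
    """
    wanted = set(wanted)
    keep = [i for i, name in enumerate(filenames) if name in wanted]
    return data[keep], [filenames[i] for i in keep]


# Illustrative data: 10 images of 2x2 pixels, image i filled with the value i.
filenames = [f"img_{i}.tif" for i in range(10)]
data = np.arange(10).reshape(10, 1, 1) * np.ones((10, 2, 2))

subset, names = select_by_filename(data, filenames, ["img_2.tif", "img_7.tif"])
print(subset.shape)  # (2, 2, 2)
print(names)         # ['img_2.tif', 'img_7.tif']
```

With only the folder path returned, there is no reliable way to tell which array index corresponds to which file.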