Read HCS data from IncuCyte Microscope

constantinpape commented 11 months ago

Hi @tischi , we would like to read HCS data from Incucyte microscopes.

The corresponding data is stored over multiple folders, where each folder contains the data for a given timepoint. E.g.

<timepoint1>/<vessel_id>/
  B2_1_Ph.tif
  B2_1_Ch1.tif
  B2_2_Ph.tif
  B2_2_Ch1.tif
  ...
<timepoint2>/<vessel_id>/
  B2_1_Ph.tif
  B2_1_Ch1.tif
  B2_2_Ph.tif
  B2_2_Ch1.tif
  ...

Here, B2 is the well ID, 1 is the position within the well, and Ph (phase contrast) and Ch1 (a fluorescence channel) are the channel names. vessel_id is an identifier that is unique across all experiments stored on the microscope.
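A minimal sketch of how this naming scheme could be parsed, in case it helps the discussion. The class name `IncuCyteRawFileName` is hypothetical and not part of MoBIE, and the pattern assumes that a well ID is one uppercase letter followed by one or two digits and that channel names are alphanumeric.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of MoBIE): parses names like "B2_1_Ph.tif". */
final class IncuCyteRawFileName {
    // Assumed layout: <well>_<position>_<channel>.tif
    private static final Pattern PATTERN = Pattern.compile(
            "(?<well>[A-Z][0-9]{1,2})_(?<position>[0-9]+)_(?<channel>[A-Za-z0-9]+)\\.tif");

    final String well;
    final int position;
    final String channel;

    private IncuCyteRawFileName(String well, int position, String channel) {
        this.well = well;
        this.position = position;
        this.channel = channel;
    }

    /** Returns the parsed name, or null if the file name does not match the pattern. */
    static IncuCyteRawFileName parse(String fileName) {
        Matcher m = PATTERN.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new IncuCyteRawFileName(
                m.group("well"),
                Integer.parseInt(m.group("position")),
                m.group("channel"));
    }
}
```

With this sketch, `IncuCyteRawFileName.parse("B2_1_Ph.tif")` yields well `B2`, position `1` and channel `Ph`; names without the `.tif` extension are rejected.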

I have put some example data in this format on the EMBL S3: i2k-2020/incu-test-data/2207/19. (Contains two timepoints with a few wells, 25 positions per well and 2 channels).

What do you think would be the best way to open this data via the MoBIE HCS loader?

tischi commented 11 months ago

I have worked on things like this. As long as (i) all the information can be parsed from the file and folder naming scheme and (ii) the individual TIFF files can be opened in Fiji, it should be no problem.

Can you please share a minimal example data set with me?

constantinpape commented 11 months ago

I have put some example data in this format on the EMBL S3: i2k-2020/incu-test-data/2207/19. (Contains two timepoints with a few wells, 25 positions per well and 2 channels).

This is the minimal example data. On the EMBL cluster you can copy it via mc cp -r embl/i2k-2020/incu-test-data/2207/19 . But I will also share an owncloud link with you to make it easier.

constantinpape commented 11 months ago

I sent a download link via EMBL Chat, @tischi.

tischi commented 11 months ago

This appears to be multi-resolution data.

I am not sure we can handle this from a TIFF, but I will have a look.

tischi commented 11 months ago

@constantinpape would it also help you if that data could be conveniently converted to an OME-Zarr plate?

tischi commented 11 months ago

This may work to open TIFF from S3, using our current mobie-io code base:

InputStream inputStream = IOHelper.getInputStream( s3address );
ImagePlus imagePlus = ( new Opener() ).openTiff( inputStream, "name" );

tischi commented 11 months ago

I found that we already have something in MoBIE for Incucyte, but it seems to be a different variant:

    /*
    example:
    MiaPaCa2-PhaseOriginal_A2_1_03d06h40m.tif
    well = A2, site = 1, frame = 03d06h40m
     */
    private static final String INCUCYTE = ".*_(?<"+WELL+">[A-Z]{1}[0-9]{1,2})_(?<"+SITE+">[0-9]{1,2})_(?<"+ T +">[0-9]{2}d[0-9]{2}h[0-9]{2}m).tif$";

Do you have any insights here?

constantinpape commented 11 months ago

Hi @tischi , thanks for looking into this so fast. Regarding the questions:

- Multi-res: I think we can just take the highest resolution. It would of course be nice if you could actually use this in the MIP of BDV, but it is not really needed.
- "Would it also help you if that data could be conveniently converted to an OME-Zarr plate?" That would be nice, but it is not a high priority. We have too much data to convert everything, and in order to keep things compatible with other software we (for now) have to keep a copy in the original format.
- "This may work to open TIFF from S3, using our current mobie-io code base." Ok, good to know! (But I suggest we first figure out how to parse the format in principle and then how to also load from S3.)
- "I found that we already have something in MoBIE for Incucyte, but it seems to be a different variant." Yes, the data is exported from the microscope in a different format (the one you have) from the one in which it is stored. We now have a lot of data and cannot export all of it (because this would mean duplicating the data, and there also isn't a good programmatic way to do it). That's why we want to access the 'storage incucyte format'.

tischi commented 11 months ago

Does that mean that IncuCyteRaw would be a good name for this?

constantinpape commented 11 months ago

Yes, that would be good!

tischi commented 11 months ago

Are these multiple plates (the example data)?

I am assuming this is one plate?

incu-test-data/2207/19/1110/262

constantinpape commented 11 months ago

It's a single plate, but imaged for multiple timepoints:

incu-test-data/2207/19/1110/262 is one timepoint (imaged on the 19.07.2022 at 11:10; 262 is the experiment id)
incu-test-data/2207/19/1120/262 is another timepoint (imaged on the 19.07.2022 at 11:20)
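A minimal sketch of how the acquisition time could be recovered from such a folder path, based on the two examples above. The class name `IncuCyteRawTimepoint` is hypothetical, the layout `<yyMM>/<dd>/<HHmm>/<experiment id>` is inferred from these two paths only, and mapping the two-digit year to 20yy is an assumption.

```java
import java.time.LocalDateTime;

/** Hypothetical helper (not part of MoBIE): parses ".../2207/19/1110/262". */
final class IncuCyteRawTimepoint {
    final LocalDateTime acquired;
    final String experimentId;

    private IncuCyteRawTimepoint(LocalDateTime acquired, String experimentId) {
        this.acquired = acquired;
        this.experimentId = experimentId;
    }

    /** Uses the last four path components: yyMM, dd, HHmm and the experiment id. */
    static IncuCyteRawTimepoint parse(String path) {
        String[] parts = path.split("/");
        int n = parts.length;
        if (n < 4) {
            throw new IllegalArgumentException("Expected .../<yyMM>/<dd>/<HHmm>/<id>: " + path);
        }
        String yearMonth = parts[n - 4];
        String day = parts[n - 3];
        String hourMinute = parts[n - 2];
        if (yearMonth.length() != 4 || hourMinute.length() != 4) {
            throw new IllegalArgumentException("Unexpected folder names in: " + path);
        }
        LocalDateTime acquired = LocalDateTime.of(
                2000 + Integer.parseInt(yearMonth.substring(0, 2)), // assumption: years are 20yy
                Integer.parseInt(yearMonth.substring(2, 4)),
                Integer.parseInt(day),
                Integer.parseInt(hourMinute.substring(0, 2)),
                Integer.parseInt(hourMinute.substring(2, 4)));
        return new IncuCyteRawTimepoint(acquired, parts[n - 1]);
    }
}
```

Sorting the folders of one experiment by `acquired` would then give the frame order, so that 1110 comes before 1120 in the example data.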

tischi commented 11 months ago

OK, I guess I could parse this such that the timepoints are correct.

Does that look OK?

constantinpape commented 11 months ago

Does that look OK?

Looks correct at first glance. (To make sure, I would need to load it myself and compare with individual images loaded in napari or Fiji.)

tischi commented 11 months ago

https://github.com/mobie/mobie-viewer-fiji/pull/1077

tischi commented 11 months ago

[x] parse timepoints.

tischi commented 11 months ago

[ ] https://github.com/BIOP/bigdataviewer-image-loaders/issues/21

tischi commented 11 months ago

It turns out that the distribution of time points across multiple files is a challenge here.

I have code for this, but it currently uses the VirtualStack from ImageJ to both concatenate and lazily load the time points from different files. The issue is that this only works if the files can be opened with the ImageJ1 Opener, because that is what VirtualStack uses to load data. This does not work here because we need Bio-Formats. In addition, with this approach we will not be able to make use of the resolution pyramid.

If we do not care about the resolution pyramid, we could implement a modified version of the VirtualStack that uses Bio-Formats to open the files instead of the ImageJ1 Opener class.
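The idea can be illustrated with a minimal, library-free sketch: one file per time point, opened only on first access, with the file reader passed in as a function so that it can be swapped. This is not how ImageJ's VirtualStack is implemented; the class name `LazyTimeSeries` and the `opener` parameter are hypothetical, and the opener stands in for either an ImageJ1-based or a Bio-Formats-based reader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch: concatenates time points stored in separate files, loading lazily. */
final class LazyTimeSeries<T> {
    private final List<String> filePerTimepoint;
    private final Function<String, T> opener; // pluggable reader, e.g. ImageJ1 or Bio-Formats based
    private final List<T> loaded;             // null until the frame is first requested

    LazyTimeSeries(List<String> filePerTimepoint, Function<String, T> opener) {
        this.filePerTimepoint = new ArrayList<>(filePerTimepoint);
        this.opener = opener;
        this.loaded = new ArrayList<>();
        for (int t = 0; t < this.filePerTimepoint.size(); t++) {
            loaded.add(null);
        }
    }

    int numTimepoints() {
        return filePerTimepoint.size();
    }

    /** Opens the file for time point t on first access and reuses the result afterwards. */
    T getFrame(int t) {
        T frame = loaded.get(t);
        if (frame == null) {
            frame = opener.apply(filePerTimepoint.get(t));
            loaded.set(t, frame);
        }
        return frame;
    }
}
```

Note that this sketch keeps every opened frame in memory and ignores the resolution pyramid, so it shares the limitations described above.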

Another potential avenue that would preserve the resolution pyramid: https://github.com/BIOP/bigdataviewer-image-loaders/issues/22

tischi commented 11 months ago

@constantinpape using the above VirtualStack approach, this now also works for the timepoints (without the pyramid). The current main branch should work; you can use this function for testing. Can you test this from the branch (preferred), or shall I release it to Fiji?

It would be good to know if this is usable enough for a whole plate on disk, because I think from S3 it will only be worse.

constantinpape commented 11 months ago

Thanks @tischi , I will test it from the branch on Sunday or Monday.

tischi commented 11 months ago

@constantinpape

I managed to also implement it for S3 🥳 .

You can try it in the same function that I linked to above.

Notes:

Currently, for each site we download the entire TIFF file with all resolutions, but we only make use of the highest resolution.
I am not sure about the memory management:
- Zooming out to view the whole plate may therefore not be a good idea; you may have to teach your users to move from well to well. It would probably take too long to load anyway.
- I also don't know what happens if you visit the whole plate well by well; I don't know whether the current implementation frees the memory of previously visited wells or whether it piles up.

Some of the above limitations could probably be improved, but this would likely require upstream contributions from @nicokiaru in bigdataviewer-image-loaders; that is, we would need to see whether the memory-mapping trick that lets Bio-Formats load objects from S3 (see discussion here and implementation here) could be implemented in bigdataviewer-image-loaders. This would let us make use of the resolution pyramid and would probably also give better memory management.
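For illustration only, one generic way to keep memory bounded when hopping from well to well is a small least-recently-used cache. The sketch below uses only the Java standard library; the class name `BoundedWellCache` is hypothetical, and nothing here describes what MoBIE or bigdataviewer-image-loaders currently do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: keeps at most maxEntries wells and evicts the least recently used one. */
final class BoundedWellCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedWellCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, which gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true drops the least recently accessed entry.
        return size() > maxEntries;
    }
}
```

With a capacity of two, putting wells B2 and B3, reading B2 again and then putting B4 would evict B3, because B2 was accessed more recently.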

constantinpape commented 11 months ago

Hi @tischi ,

I have tested it now, and it works really well! The loading speeds are good both when loading the data from local files and from S3. I didn't try to zoom to the full plate level with S3, but hopping between wells worked quite well (with some loading delays, but still usable).

I think on the technical level that is all we need for now. For better performance we may need to convert selected data to OME-Zarr, but it is already great to have this functionality to quickly check the uploaded data.

Two more things would be helpful in addition:

- specifying the locations of all folders for a well from a text file / table
- specifying per table metadata

I would suggest first merging the current changes; I can then lay this out in more detail in a separate issue.

tischi commented 11 months ago

I released it.

mobie / mobie-viewer-fiji

Read HCS data from IncuCyte Microscope #1075