mehta-lab / dynamorph

Learn morphological states of dynamic cells
BSD 3-Clause "New" or "Revised" License

support for 5D images and changing the number of channels #14

Closed · smguo closed this issue 3 years ago

smguo commented 3 years ago

Currently only 4D images (t, c, y, x) are supported. It would be useful to change the data structure to accommodate a z dimension (not necessarily to train a 3D model, which can be slow due to memory limitations).

Also, the images are currently assumed to always have 2 channels (phase & retardance), and the input and output channels are always the same (so we can only train an autoencoder). This could be made more flexible to enable training for different tasks (e.g., image translation).
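For concreteness, here is a minimal sketch of what a more flexible dataset could look like. The class name, constructor arguments, and the (t, c, z, y, x) layout are assumptions for illustration, not part of the current codebase:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class FlexibleChannelDataset(Dataset):
    """Hypothetical sketch: wraps a 5D array (t, c, z, y, x) and lets the
    caller pick arbitrary input/output channels, so the same loader can
    serve autoencoding (input == output) or image translation tasks."""

    def __init__(self, data, input_channels, output_channels):
        # data: numpy array of shape (t, c, z, y, x)
        assert data.ndim == 5, "expected 5D data (t, c, z, y, x)"
        self.data = data
        self.input_channels = input_channels
        self.output_channels = output_channels

    def __len__(self):
        # one sample per (t, z) slice; z is iterated over, not convolved over
        t, _, z, _, _ = self.data.shape
        return t * z

    def __getitem__(self, idx):
        t, _, z, _, _ = self.data.shape
        ti, zi = divmod(idx, z)
        frame = self.data[ti, :, zi]  # (c, y, x)
        x = torch.from_numpy(frame[self.input_channels].astype(np.float32))
        y = torch.from_numpy(frame[self.output_channels].astype(np.float32))
        return x, y
```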

mattersoflight commented 3 years ago

The next iteration of the data structure can focus on providing flexibility in which channels are used as inputs and outputs. It will be important to extend the data structure to accommodate 5D data (t, c, z, y, x), which can be done along with the switch to DataLoader (#15).

mattersoflight commented 3 years ago

@miaecle as you are updating the data structures and training loop, please make sure that the channels used as inputs and outputs of the VQ-VAE can be chosen arbitrarily. It would be interesting to examine the latent space of a model that translates one modality into another; such a model may be more regular in a biological sense.
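As one illustration of what decoupled input/output channels could mean for the model, a minimal encoder/decoder sketch follows. This is not the repo's actual VQ-VAE; the vector-quantization step is elided and all layer sizes are assumptions:

```python
import torch.nn as nn

class TranslationVQVAE(nn.Module):
    """Hypothetical sketch: an encoder/decoder whose channel counts are
    decoupled, e.g. encode phase (num_in=1) and decode retardance
    (num_out=1) for image translation, or num_in == num_out for
    autoencoding. Quantization of the latent is omitted."""

    def __init__(self, num_in, num_out, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(num_in, hidden, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, hidden, 4, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(hidden, hidden, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(hidden, num_out, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)  # vector quantization would happen here
        return self.decoder(z)
```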

miaecle commented 3 years ago

@mattersoflight Yeah, fully agree. I am already switching some of the code to accept arbitrary channel inputs, and will merge that in a new PR soon.

Also, just for consistency: right now the full pipeline takes (t, x, y, c) input. Would that be fine, or should we switch to (t, c, x, y)?

In terms of the z dimension, will it be used more like a spatial dimension (there could be a sliding window in z) or like a channel dimension? If most inputs (and convolution operations) will be in 2D or 2.5D, I believe it won't be hard to make the change, but adopting 3D inputs/convolutions would be a bit out of scope.
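To make the 2.5D option concrete, a hedged sketch of folding a sliding z-window into the channel axis so all convolutions stay 2D; the function name and shapes here are hypothetical:

```python
import torch

def z_window_as_channels(vol, zi, window=3):
    """Hypothetical sketch of the 2.5D option: take a sliding window of
    z-slices around index zi and fold them into the channel axis, so a
    plain 2D convolution sees (c * window, y, x). Near the stack
    boundaries the window is simply truncated."""
    # vol: tensor of shape (c, z, y, x)
    c, z, y, x = vol.shape
    lo = max(0, zi - window // 2)
    hi = min(z, lo + window)
    win = vol[:, lo:hi]           # (c, window, y, x)
    return win.reshape(-1, y, x)  # (c * window, y, x)
```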

mattersoflight commented 3 years ago

@miaecle for thin specimens like microglia, the utility of the z dimension is to regularize the model, since the latent vectors of a single z-stack should be close together. At this point, @smguo maps z-slices to the time dimension to do that. We won't have 3D convolutions to encode 3D morphology in this model. Tensors of format (t, z, x, y, c) seem like a good extension of the current (t, x, y, c) format, with the goal of computing the matching loss over both t and z. What do you both think?
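A minimal sketch of what a matching loss over both t and z could look like, assuming latent vectors are stacked per frame and the adjacency pairs are supplied separately; all names here are hypothetical, not the repo's implementation:

```python
import torch
import torch.nn.functional as F

def matching_loss(latents, relations):
    """Hypothetical sketch of a pairwise matching loss: pull together the
    latent vectors of frames declared adjacent in t or z. `relations` is a
    list of (i, j) index pairs; constructing those pairs is exactly what
    reorder_with_trajectories / generate_trajectory_relations would need
    to extend to cover z as well as t."""
    # latents: tensor of shape (n_frames, latent_dim)
    i = torch.tensor([p[0] for p in relations])
    j = torch.tensor([p[1] for p in relations])
    return F.mse_loss(latents[i], latents[j])
```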

smguo commented 3 years ago

@miaecle @mattersoflight Yes, I agree adapting 3D inputs/convolutions is out of scope. I was only thinking of expanding the current data structure to accommodate both t and z dimensions. As @mattersoflight mentioned, one could apply the matching loss to the extra z-slices to further regularize the latent space, although applying the matching loss to both t and z will require changes to the reorder_with_trajectories and generate_trajectory_relations functions (sketched below).
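For illustration, one hedged way such a relation generator could be extended to emit both t- and z-adjacency pairs; the flattening convention (index = t * n_z + z) and the function name are assumptions, not the actual generate_trajectory_relations signature:

```python
def generate_tz_relations(n_t, n_z):
    """Hypothetical sketch: enumerate adjacency pairs over both t and z
    for a stack whose frames are flattened as index = t * n_z + z."""
    relations = []
    for t in range(n_t):
        for z in range(n_z):
            idx = t * n_z + z
            if z + 1 < n_z:                  # neighbor in z, same t
                relations.append((idx, idx + 1))
            if t + 1 < n_t:                  # neighbor in t, same z
                relations.append((idx, idx + n_z))
    return relations
```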

As for the dimension order, I think (t, x, y, c) is ok for now, but it might be good to switch to (t, c, z, y, x) at some point to be consistent with the PyTorch dimension-order convention...
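For reference, converting between the two layouts is a single transpose; a small sketch assuming NumPy arrays, with a singleton z axis inserted for the current 4D data:

```python
import numpy as np

# Hypothetical example: convert the current (t, x, y, c) layout to the
# PyTorch-style (t, c, z, y, x), adding a singleton z axis for 4D data.
arr_txyc = np.zeros((10, 256, 256, 2))           # (t, x, y, c)
arr_tcyx = np.transpose(arr_txyc, (0, 3, 2, 1))  # (t, c, y, x)
arr_tczyx = arr_tcyx[:, :, np.newaxis]           # (t, c, 1, y, x)
print(arr_tczyx.shape)                           # (10, 2, 1, 256, 256)
```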

miaecle commented 3 years ago

@smguo Yeah, that makes sense to me; I will switch everything to (t, c, z, y, x) then (is this equivalent to (t, c, z, x, y)?).

miaecle commented 3 years ago

@smguo @mattersoflight Everything should be fixed in PR #17.