Obtain mapping inside training dataset

riu83 commented 6 months ago

Hi! Thank you for this interesting tool!

Rather than predicting unseen perturbation outcomes, I am interested in just finding a mapping between wild-type and perturbed cells in my training dataset. Is there a way to get this explicitly from the model? Or do I have to transport() my wild-type cells and find the closest perturbed cell for each? Thank you!

bunnech commented 6 months ago

If you are only interested in a mapping within the training set, there is no need to train a neural network at all. You can simply compute an optimal transport (OT) coupling between your control and perturbed cells, as in https://ott-jax.readthedocs.io/en/latest/tutorials/application_biology.html.
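To illustrate what such a coupling looks like, here is a minimal sketch in plain NumPy. It is not the ott-jax code from the linked tutorial: it swaps in a small log-domain Sinkhorn solver for entropic OT with uniform weights, and the function name `sinkhorn_coupling`, the `epsilon` and `n_iters` defaults, and the toy arrays are all assumptions made for illustration. With real data, the two inputs would be the control and perturbed expression matrices.

```python
import numpy as np


def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_coupling(control, treated, epsilon=0.05, n_iters=500):
    """Entropic OT coupling between two point clouds with uniform weights.

    Hypothetical helper, not part of CellOT or ott-jax.
    Returns an (n_control, n_treated) matrix whose entries sum to 1.
    """
    control = np.asarray(control, dtype=float)
    treated = np.asarray(treated, dtype=float)
    n, m = control.shape[0], treated.shape[0]

    # Squared Euclidean cost, rescaled by its mean so epsilon is unit-free.
    # Note: this builds an n x m x d array, so it only suits small inputs.
    cost = ((control[:, None, :] - treated[None, :, :]) ** 2).sum(-1)
    cost = cost / cost.mean()

    log_a = np.full(n, -np.log(n))  # uniform mass on control cells
    log_b = np.full(m, -np.log(m))  # uniform mass on treated cells
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternate dual updates so row, then column, marginals are matched.
        f = epsilon * (log_a - _logsumexp((g[None, :] - cost) / epsilon, axis=1))
        g = epsilon * (log_b - _logsumexp((f[:, None] - cost) / epsilon, axis=0))
    return np.exp((f[:, None] + g[None, :] - cost) / epsilon)


# Toy data (assumption): two clusters of control cells, each shifted by +1.
control = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
treated = control + 1.0

coupling = sinkhorn_coupling(control, treated)
# Hard assignment: for each control cell, the treated cell with the most mass.
matched = coupling.argmax(axis=1)
```

Each row of `coupling` is a soft assignment of one control cell over all perturbed cells; taking the row-wise argmax turns it into a one-to-many hard mapping. A smaller `epsilon` gives a sharper coupling but needs more iterations to converge.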

You can of course also do this with CellOT. In general, you can follow the evaluation script (https://github.com/bunnech/cellot/blob/main/scripts/evaluate.py). Just make sure that you load the data from the train set rather than the test set. Concretely, in lines 209-210, you need to use the train split, i.e.,

        control = dataset.train.source.adata.to_df()
        treated = dataset.train.target.adata.to_df()

The same applies to every other location in the evaluation script where dataset.test.source.adata.to_df() was previously called. Hope this helps!
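For the second option raised in the question (transport the wild-type cells, then find the closest perturbed cell for each), the matching step could be sketched as below. This is only an illustration under stated assumptions: `nearest_treated` is a hypothetical helper, not a CellOT function, and `transported` stands for whatever array of predicted cells the model produces for the training controls; the toy values are made up.

```python
import numpy as np


def nearest_treated(transported, treated):
    """For each transported cell, return the index of the closest treated
    cell and the Euclidean distance to it. Hypothetical helper."""
    transported = np.asarray(transported, dtype=float)
    treated = np.asarray(treated, dtype=float)
    # Pairwise squared distances via ||x||^2 - 2 x.y + ||y||^2.
    d2 = (
        (transported ** 2).sum(axis=1)[:, None]
        - 2.0 * transported @ treated.T
        + (treated ** 2).sum(axis=1)[None, :]
    )
    idx = d2.argmin(axis=1)
    # Clip tiny negative values caused by floating-point cancellation.
    dist = np.sqrt(np.maximum(d2[np.arange(len(idx)), idx], 0.0))
    return idx, dist


# Toy data (assumption): two predicted cells and three observed treated cells.
transported = np.array([[1.0, 1.0], [6.0, 6.0]])
treated = np.array([[6.1, 6.0], [0.9, 1.0], [3.0, 3.0]])

idx, dist = nearest_treated(transported, treated)
```

With DataFrames like the ones above, the inputs could be passed as `.values`, and `idx` could then be used to look up the matched rows of `treated`. Unlike an OT coupling, this nearest-neighbour matching does not enforce that every perturbed cell is used, so several control cells may map to the same perturbed cell.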

riu83 commented 6 months ago

I see. Thank you!

bunnech / cellot

Obtain mapping inside training dataset #14