sepidism opened 10 months ago
Specifically, in evaluate_r2_sc, the input to compute_prediction() is y_true, which means the input and the output are basically the same.
Hi @sepidism,
I am not sure I understood your question correctly, but the encoder takes in the treated cells, which are then embedded in a disentangled fashion (latent-space arithmetic of basal state, perturbation state, and cell state). After training, for evaluation, we assess to what extent the model is able to decode the ground-truth cell signal by comparing its output to the originally measured gene expression.
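For readers following along, the comparison against measured gene expression can be illustrated with a minimal sketch. This is not the repository's evaluate_r2_sc. The function names below are hypothetical, and it assumes the score is an R2 computed on per-gene mean expression across cells, which the thread does not confirm.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


def mean_expression_r2(measured, decoded):
    """R2 between per-gene means of measured and decoded cells.

    Both arrays have shape (n_cells, n_genes). Averaging over cells
    first is an assumption made for this illustration.
    """
    return r2_score(measured.mean(axis=0), decoded.mean(axis=0))


# Synthetic data only: 50 cells, 4 genes with different mean expression.
rng = np.random.default_rng(0)
measured = rng.normal(loc=[1.0, 2.0, 3.0, 4.0], scale=0.1, size=(50, 4))
noisy = measured + rng.normal(scale=1.0, size=measured.shape)

score_identical = mean_expression_r2(measured, measured.copy())
score_noisy = mean_expression_r2(measured, noisy)
```

If the decoded array is identical to the measured one, the score is exactly 1 by construction, which is why it matters what is actually fed into the model at evaluation time.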
Hi @MxMstrmn, sorry for the confusion. My question is: during test/evaluation, is the input again the treated cells? In your code, the input to model.predict() during evaluation in evaluate_r2_sc is the treated cell line.
Hi @sepidism,
The input to the model is simply the control genes of all cell lines present in the dataset, with no treatment at all. The treatment is inferred only from the metadata and then from the resulting embeddings, which are added in the latent space.
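The latent-space addition described above can be sketched roughly as follows. This is a toy illustration, not the repository's model: the linear encoder and decoder, the embedding dictionaries, the labels "drugA" and "typeA", and the predict function are all hypothetical stand-ins for trained components.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, latent_dim = 6, 3

# Hypothetical linear encoder/decoder weights (stand-ins for trained networks).
W_enc = rng.normal(size=(n_genes, latent_dim))
W_dec = rng.normal(size=(latent_dim, n_genes))

# Hypothetical learned embeddings, looked up from metadata labels only.
drug_embedding = {
    "control": np.zeros(latent_dim),
    "drugA": rng.normal(size=latent_dim),
}
cell_type_embedding = {"typeA": rng.normal(size=latent_dim)}


def predict(control_expression, drug, cell_type):
    """Encode control cells, add metadata embeddings, then decode."""
    basal = control_expression @ W_enc  # basal state from expression
    latent = basal + drug_embedding[drug] + cell_type_embedding[cell_type]
    return latent @ W_dec               # predicted expression


# Only control-cell expression is passed in; the treatment enters via its label.
control_cells = rng.normal(size=(4, n_genes))
pred_ctrl = predict(control_cells, "control", "typeA")
pred_drug = predict(control_cells, "drugA", "typeA")
```

In this toy version, the difference between pred_drug and pred_ctrl depends only on the drug embedding, not on any treated-cell expression.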
Hi there, I wanted to double-check something and would appreciate your help. What is the initial embedding for each gene? I see you have the line "self.genes = torch.Tensor(data.X.A)", which you then pass through the encoder and the rest of your architecture. However, my concern is that, this way, your model sees the gene expression of the treated cell lines, which could influence your final results. Is that not a concern, or am I missing something here?