Open cjiang310437 opened 1 month ago
First, thank you @SeppeDeWinter and Aertslab for your amazing tool!
I am wondering myself the same question. Also, I was analyzing my gene expression matrix data before and after performing Simulation Perturbation. What I have noticed is that before I run the Perturbation, the data I obtain is normalized data. I thought that for this analysis, raw data would also be included and that the function would internally normalize and/or transform it:
raw_data = adata.to_df()
gex_scenic = scplus_mdata["scRNA_counts"].to_df()
The perturbed matrices take this data as a basis for their simulation analysis, and the perturbed data, I think already appears normalized, like this:
simulation_scenic = perturbation_over_iter[5]
So, I am not sure if this (starting the analysis with normalized data) is significantly altering the Perturbation analysis by already using normalized data? Or is it fine to use it this way? I dont have big experience analyzing data, especially knockouts, but any information would be very useful. Thank you.
Hello,
First of all, thank you for developing such a powerful and intuitive tool.
Describe the bug According to the documentation, the RNA count input for the SCENIC+ pipeline should consist of raw counts, which are expected to be stored in the adata.raw slot. However, after following the tutorial's steps, I observed that the adata.raw slot seems to be overwritten with normalized counts instead of retaining the raw counts.
Here are the key details:
adata.raw
now contains the normalized data, not the raw counts.adata.raw
slot should contain raw counts and that these should be used as input for the SCENIC+ pipeline.Additionally, I tested running the pipeline using both raw and normalized RNA counts, and the results were significantly different. The results generated using normalized counts seem more promising. Could you kindly clarify which input (raw or normalized counts) is appropriate for running the SCENIC+ pipeline? It would also be helpful to understand why one input should be preferred over the other and how this impacts the pipeline results.
I appreciate your guidance and look forward to your response. Thank you again for your continued efforts in developing and maintaining this tool.
To Reproduce
Error output 1384.0 6.714874659931793
Expected behavior The expected is that adata.raw before and after normalization should be the same.
Screenshots
Version: