Error in Bulk RNA Deconvolution with Omicverse

Starlitnightly / omicverse

A Python library for multi-omics analysis, including bulk, single-cell, and spatial RNA-seq.

https://starlitnightly.github.io/omicverse/

GNU General Public License v3.0


Error in Bulk RNA Deconvolution with Omicverse #100

Open sancho-o opened 3 months ago

sancho-o commented 3 months ago

Dear Omicverse Team,

I want to congratulate you on the outstanding work you've done with Omicverse!

I am writing to seek your assistance with an issue I encountered while trying to deconvolute bulk RNA data using Omicverse. I started with normalized counts and utilized the same reference genome as in your tutorial, but I encountered the following error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (9x12374 and 12371x256)

I'm unsure about the cause of this error and would greatly appreciate your guidance on this matter.

Thank you very much for your assistance and for your ongoing contributions to the scientific community. I look forward to your advice.


Regards, Osama

Starlitnightly commented 3 months ago

This error may be caused by duplicate gene names in your data.
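The shapes in the error message fit this explanation: the input appears to have 12374 gene columns, while the layer it is multiplied with expects 12371, a difference of three. As a minimal sketch of how one might check for and remove duplicate gene names before deconvolution, the snippet below uses plain pandas. The `bulk` table, its gene names, and its values are made up for illustration, and the genes-as-rows layout is an assumption; neither deduplication policy shown is confirmed as what Omicverse expects.

```python
import pandas as pd

# Hypothetical bulk matrix (assumption): rows are genes, columns are samples.
bulk = pd.DataFrame(
    {"sample_1": [5.0, 3.0, 2.0, 7.0], "sample_2": [1.0, 4.0, 6.0, 0.0]},
    index=["GeneA", "GeneB", "GeneA", "GeneC"],
)

# List every gene name that occurs more than once.
duplicated = bulk.index[bulk.index.duplicated(keep=False)].unique().tolist()
print("Duplicated gene names:", duplicated)

# Policy 1: keep only the first row for each gene name.
bulk_unique = bulk[~bulk.index.duplicated(keep="first")]

# Policy 2: sum the rows that share a gene name.
bulk_summed = bulk.groupby(level=0).sum()

print(bulk_unique.shape, bulk_summed.shape)
```

Which policy is appropriate depends on the data: summing is usually meant for raw counts, so for already normalized values keeping one row per gene may be the safer choice.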

Regards, Zehua

sancho-o commented 2 months ago

Thank you @Starlitnightly for your prompt response. Your suggestion resolved the issue.

I have a follow-up question. Is it normal for results to vary in this manner? In my initial analysis, a specific cell type was higher in the disease group compared to the control (Cell Fraction Results). However, in the second quantification, the cell type appeared higher in the control group (Cell Number Results), contrary to the first analysis results.

Could you please help me understand why this discrepancy might occur?

Thank you.

Regards, Osama

Starlitnightly commented 2 months ago

Did you double-check that the inputs were identical? @sancho-o

Starlitnightly commented 2 months ago

When I tested the algorithm, the three datasets from the manuscript gave consistently robust cell fractions, although the generated scRNA-seq data would vary.

sancho-o commented 2 months ago

Thank you @Starlitnightly for your prompt response.

I ran the entire script in one go and followed the exact tutorial on the website, so the inputs should be accurate. Do you know of any specific reason for this outcome? It seems entirely contrary to the expected results.

Regards, Osama

sancho-o commented 2 months ago

I understand that the generated scRNA-seq could vary, but getting results that are completely opposite to expectations is somewhat confusing. This outcome is also puzzling because these cell numbers are for immune cells, which should typically be higher in disease than in the control.

Starlitnightly commented 2 months ago

Oh, I thought you meant that the predicted Cell Fraction was inconsistent between the two runs. The algorithm for the Cell Fraction prediction is fine-tuned from TAPE & Scaden, so in theory it should remain consistent with TAPE & Scaden. If something is wrong with that part, it could be an inherent limitation of TAPE & Scaden.

sancho-o commented 2 months ago

Thank you for your response. I understand the points you've raised.

However, I do not have any problems with TAPE or the cell fraction prediction results. The actual issue is that the predicted cell numbers produced by the VAE model do not match the TAPE results and instead show the opposite trend. This discrepancy is what I am asking about.
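One way to make this discrepancy concrete for a bug report is to compare the direction of the disease-versus-control difference in both outputs. The following is only a sketch with made-up numbers: the `fractions` and `cell_numbers` tables, the sample names, the cell type labels, and the helper `disease_minus_control` are all hypothetical and are not part of the Omicverse API.

```python
import pandas as pd

# Hypothetical group labels per sample (assumption).
groups = pd.Series(
    ["disease", "disease", "control", "control"],
    index=["s1", "s2", "s3", "s4"],
)

# Hypothetical predicted cell fractions (samples x cell types).
fractions = pd.DataFrame(
    {"T_cell": [0.30, 0.34, 0.20, 0.22], "B_cell": [0.10, 0.12, 0.11, 0.13]},
    index=groups.index,
)

# Hypothetical cell numbers counted from the generated single-cell data.
cell_numbers = pd.DataFrame(
    {"T_cell": [40, 44, 60, 58], "B_cell": [10, 12, 14, 15]},
    index=groups.index,
)


def disease_minus_control(table, groups):
    """Return mean(disease) - mean(control) for each cell type."""
    means = table.groupby(groups).mean()
    return means.loc["disease"] - means.loc["control"]


summary = pd.DataFrame(
    {
        "fraction_diff": disease_minus_control(fractions, groups),
        "number_diff": disease_minus_control(cell_numbers, groups),
    }
)

# Flag cell types where both outputs point in the same direction.
summary["same_direction"] = (summary["fraction_diff"] * summary["number_diff"]) > 0
print(summary)
```

Cell types with `same_direction` equal to `False` are the ones where the two outputs disagree, which narrows down what needs to be reproduced.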

Starlitnightly commented 2 months ago

@sancho-o, this sounds very strange. Can you provide me with complete example data and code? I'd like to try to reproduce this issue or fix it in the next release.