BostonGene / Kassandra

Bostongene cell deconvolution algorithm from RNAseq

Input difference for TUMOR and BLOOD model #7

Open dano602 opened 1 year ago

dano602 commented 1 year ago

Hello. I have encountered an issue while using Kassandra through the web page. We generated the input for Kassandra with Kallisto and uploaded it through the web page. However, only the TUMOR model works for us. When we use the same input and pick the BLOOD model, we get the error message: "Model does not work with log normalized data. Linearize your expression matrix." The input data are the same, but switching the model to BLOOD produces this error. Do you know what the issue is and how I can run the BLOOD model? Thank you very much.

shpakb commented 1 year ago

Hello!

Please make sure that your input expression values are linear TPM and not log-transformed. You can check this by making sure that the values for all genes in each sample sum to 1M TPM. If they do not, there is something wrong with your data. To check whether the data are log-normalized, look at the maximum expression value in each sample. If a sample has no values greater than, let's say, 50, it is probably log-normalized. To linearize the data you could use this Python code:

expr = 2**expr - 1

I would also recommend checking whether your input to the Tumor model was linear TPM. It seems we forgot to add a check for log-transformed data to the Tumor model. It might return something, but the result is probably far from what it would give on correct input values.
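The two checks described above (per-sample sum close to 1M, and a per-sample maximum that is suspiciously small) can be sketched with pandas. This is only an illustration, not Kassandra's own validation code: the function names, the 1% tolerance, and the toy matrix are assumptions, and the threshold of 50 is the rough figure suggested above. The `2**expr - 1` step only restores linear values if the data were transformed as log2(x + 1).

```python
import numpy as np
import pandas as pd


def check_linear_tpm(expr, total=1e6, rel_tol=0.01, log_max=50.0):
    """Per-sample report for a genes x samples expression matrix.

    `rel_tol` and `log_max` are illustrative defaults, not values
    taken from Kassandra.
    """
    sums = expr.sum(axis=0)    # one total per sample (column)
    maxima = expr.max(axis=0)  # largest value per sample
    return pd.DataFrame({
        "sum": sums,
        "max": maxima,
        "sums_to_1M": np.isclose(sums, total, rtol=rel_tol),
        "looks_log": maxima <= log_max,
    })


def linearize(expr):
    """Undo a log2(x + 1) transform (the assumption behind 2**expr - 1)."""
    return 2 ** expr - 1


# Toy data with hypothetical gene and sample names.
linear = pd.DataFrame(
    {"S1": [900000.0, 99990.0, 10.0], "S2": [500000.0, 499000.0, 1000.0]},
    index=["GENE_A", "GENE_B", "GENE_C"],
)
logged = np.log2(linear + 1)

report_linear = check_linear_tpm(linear)
report_logged = check_linear_tpm(logged)
restored = linearize(logged)

print(report_linear)
print(report_logged)
```

A sample that passes `sums_to_1M` and fails `looks_log` is consistent with linear TPM; the reverse pattern suggests the matrix was log-transformed before upload.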

Cheers!

dano602 commented 1 year ago

Hello. Thank you very much for your reply. I checked again. The sum of all genes for each individual sample is exactly 1M. I tried the online tool again: the same input is accepted for both models as before, but I only receive results for the TUMOR model. For the BLOOD model I get: "Model does not work with log normalized data. Linearize your expression matrix." The input for both models is the same, so I don't understand how the issue could be in the input.

shpakb commented 1 year ago

That is strange. I uploaded one of the expression files from this GitHub repository and it worked for both models. I asked the DevOps folks to check the models on the web page for a possible source of the different behavior. In the meantime, I would recommend double-checking your expression matrix. Compare it to the files in this repository and see whether they are in the same format.

shpakb commented 1 year ago

It appears the blood model fails if it is passed expression in transcripts. This will be fixed soon. For now, you can try running with expression in genes, which will work for both models.
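For Kallisto output, one common way to get gene-level expression is to sum the TPM of all transcripts that belong to the same gene. The sketch below assumes a transcript-to-gene mapping table is available; the table `tx2gene`, its `gene` column, and all identifiers are hypothetical, and the thread does not say which gene identifiers the web page expects. The `target_id` and `tpm` columns match Kallisto's `abundance.tsv`.

```python
import pandas as pd

# Stand-in for one sample's Kallisto abundance.tsv (only the columns used here).
abundance = pd.DataFrame({
    "target_id": ["TX1", "TX2", "TX3"],
    "tpm": [600000.0, 300000.0, 100000.0],
})

# Hypothetical transcript-to-gene mapping, e.g. derived from the annotation
# that was used to build the Kallisto index.
tx2gene = pd.DataFrame({
    "target_id": ["TX1", "TX2", "TX3"],
    "gene": ["GENE_A", "GENE_A", "GENE_B"],
})


def transcripts_to_genes(abundance, tx2gene):
    """Sum transcript-level TPM into gene-level TPM."""
    merged = abundance.merge(tx2gene, on="target_id", how="inner")
    return merged.groupby("gene")["tpm"].sum()


gene_tpm = transcripts_to_genes(abundance, tx2gene)
print(gene_tpm)
```

Because the merge is an inner join, transcripts missing from the mapping are dropped, so the gene-level column can sum to less than 1M if the mapping is incomplete.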

dano602 commented 1 year ago

Thank you again for your response. I compared my input with the example provided on the web page, and it looks identical to me. However, I noticed that the sample columns in the example file do not add up to 1M. This example file does not work when you select transcripts as input and pick the Blood model. I just tried the option "expression of genes (arbitrary form of input)" as suggested: both the Blood and Tumor models work for your input, but only Tumor works for mine. I get the message: "There were errors while processing your request using deconvolution by Kassandra Tumor model."