BIMSBbioinfo / ikarus

Identifying tumor cells at the single-cell level using machine learning
MIT License

Looks like reversed labeling - tumor <=> normal #22

Open wgsim opened 2 months ago

wgsim commented 2 months ago

Hi, thank you for developing a nice tool!

I've tried ikarus on my single-cell RNA-seq datasets, which include normal cells cultured in vitro and tumor cells that developed in vivo.

These are mouse cells, so I had to convert the gene symbols to their human orthologs.

After that, I ran the model prediction with your signature.gmt and core_model.joblib, using the adapt_signatures = True option because the gene overlap is below 80%.

I got prediction results, but when I checked them, all of my in vitro cultured normal cells were labeled as tumor cells, and most of my in vivo tumor cells were labeled as normal cells.

Is there any chance that the model predictions were reversed? Or is it just because my samples have unusual expression patterns?

I'd appreciate any advice.

Thank you!

melonheader commented 3 weeks ago

Hello @wgsim,

Thanks a lot for trying out our software, and apologies for the delayed response.

It does indeed sound as if the signatures were reversed at some point. However, there might be something wrong with the scoring as well. To rule that out, could you kindly check the distributions of the signature.gmt scores across your normal and cancer cells? Another thing to check is how many genes from signature.gmt are actually present in the dataset.
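In case it helps, here is a minimal sketch of both checks in plain Python and pandas. It does not use the ikarus API. The helper names (`read_gmt`, `signature_overlap`, `score_summary`), the signature names `Normal` and `Tumor`, and the toy genes, scores, and labels are all assumptions for illustration; substitute the real signature.gmt, your dataset's gene symbols, and the per-cell scores you obtained.

```python
import io

import pandas as pd


def read_gmt(handle):
    """Parse GMT text: set name, description, then gene symbols, tab-separated."""
    signatures = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        signatures[fields[0]] = [gene for gene in fields[2:] if gene]
    return signatures


def signature_overlap(signatures, dataset_genes):
    """Count how many genes of each signature are present in the dataset."""
    present = set(dataset_genes)
    rows = []
    for name, genes in signatures.items():
        n_found = sum(gene in present for gene in genes)
        rows.append(
            {
                "signature": name,
                "n_genes": len(genes),
                "n_found": n_found,
                "fraction_found": n_found / len(genes),
            }
        )
    return pd.DataFrame(rows).set_index("signature")


def score_summary(scores, labels):
    """Median signature score per known sample group (one row per group)."""
    return scores.groupby(labels).median()


# Toy stand-in for signature.gmt (hypothetical gene names).
toy_gmt = io.StringIO(
    "Normal\tdescription\tGENE1\tGENE2\tGENE3\tGENE4\n"
    "Tumor\tdescription\tGENE5\tGENE6\n"
)
signatures = read_gmt(toy_gmt)

# Toy stand-in for the gene symbols in the dataset after ortholog conversion.
overlap = signature_overlap(signatures, ["GENE1", "GENE2", "GENE5", "GENE9"])
print(overlap)

# Toy per-cell scores, one column per signature, plus the known sample origin.
cells = ["c1", "c2", "c3", "c4"]
scores = pd.DataFrame(
    {"Normal": [0.1, 0.3, 0.8, 0.6], "Tumor": [0.9, 0.7, 0.2, 0.4]},
    index=cells,
)
labels = pd.Series(
    ["in_vitro_normal", "in_vitro_normal", "in_vivo_tumor", "in_vivo_tumor"],
    index=cells,
    name="origin",
)
print(score_summary(scores, labels))
```

If the normal signature scores higher in the known tumor cells (as in the toy table), the flip is already present at the scoring step rather than in the classifier. A low `fraction_found` would instead point to the ortholog conversion.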

Best,