niu-lab / msisensor2

Microsatellite instability (MSI) detection for tumor only data.
GNU General Public License v3.0

inconsistent output between msisensor and msisensor2 #28

Open windtalker6 opened 2 years ago

windtalker6 commented 2 years ago

I tested msisensor and msisensor2 on 2 samples (22SSP1000008 and 22SSP1000009).

Both samples have tumor BAM files and matched normal BAM files.

For 22SSP1000008, msisensor (tumor + normal) reports MSI-H:

Total_Number_of_Sites  Number_of_Somatic_Sites  %
948                    124                      13.08

This is MSI-H according to https://github.com/ding-lab/msisensor/issues/29 (cutoff: 10%).

msisensor2 result:

Total_Number_of_Sites  Number_of_Somatic_Sites  %
159                    9                        5.66

This is MSI-L or MSS according to https://github.com/niu-lab/msisensor2/issues/3 (cutoff: 20%).

For 22SSP1000009, which is MSS, msisensor produced:

Total_Number_of_Sites  Number_of_Somatic_Sites  %
1010                   12                       1.19

msisensor2 output:

Total_Number_of_Sites  Number_of_Somatic_Sites  %
159                    2                        1.26

The results are nearly identical, and both of them are MSS.
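
As a quick check of the numbers above: the reported percentage is Number_of_Somatic_Sites divided by Total_Number_of_Sites. The following Python sketch recomputes the four scores and compares each against the cutoff cited for its tool. The helper names `msi_score` and `classify` are hypothetical, and treating a score at or above the cutoff as MSI-H is an assumption; the linked issues may define the boundary differently.

```python
# Counts copied from the outputs reported above: (total sites, somatic sites).
results = {
    ("22SSP1000008", "msisensor"): (948, 124),
    ("22SSP1000008", "msisensor2"): (159, 9),
    ("22SSP1000009", "msisensor"): (1010, 12),
    ("22SSP1000009", "msisensor2"): (159, 2),
}

# Cutoffs (in percent) as cited in this thread from the linked issues.
CUTOFFS = {"msisensor": 10.0, "msisensor2": 20.0}


def msi_score(total_sites, somatic_sites):
    """Percentage of evaluated sites that were called somatic."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return 100.0 * somatic_sites / total_sites


def classify(score, cutoff):
    """Hypothetical helper. Assumption: score >= cutoff means MSI-H."""
    return "MSI-H" if score >= cutoff else "MSS or MSI-L"


for (sample, tool), (total, somatic) in results.items():
    score = msi_score(total, somatic)
    label = classify(score, CUTOFFS[tool])
    print(f"{sample}\t{tool}\t{score:.2f}\t{label}")
```

With these cutoffs, 22SSP1000008 is labeled MSI-H by msisensor (13.08) but not by msisensor2 (5.66), which is the discrepancy this issue is about.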

Beifang commented 2 years ago

Thanks for testing msisensor and msisensor2. msisensor2 uses ~3000 site models trained on ~1500 TCGA WES samples. We cannot guarantee 100% AUC for WES tumor-only data, so we strongly recommend using msisensor if tumor-normal paired data are available.

I suppose the sequencing data for these two samples come from some kind of custom gene panel that covers only 159 msisensor2 models. So I think the recommended 10% cutoff is not suitable for this custom gene panel.

windtalker6 commented 2 years ago

Yes, these two samples were sequenced using a panel of ~700 genes.

For msisensor, the 10% cutoff works: it differentiates MSI-H from MSS. For msisensor2, however, the 20% cutoff seems to be too high for my data.

windtalker6 commented 2 years ago

Then can you give me some advice on choosing a new cutoff value for msisensor2?
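
One generic way to approach a panel-specific cutoff, offered only as a sketch and not as the maintainers' recommendation, is to run msisensor2 on samples whose MSI status is already known (for example from paired msisensor calls) and pick the threshold that maximizes Youden's J (sensitivity + specificity - 1). The function name `best_cutoff` is hypothetical, and calling a sample MSI-H when its score is at or above the cutoff is an assumption. The example uses only the two samples from this thread, which is far too few to yield a meaningful cutoff; it merely shows the mechanics.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    scores: msisensor2 percentages, one per sample.
    labels: True if the sample is known to be MSI-H, else False.
    Assumption: a sample is called MSI-H when score >= cutoff.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one MSI-H and one non-MSI-H sample")

    best = None
    # Only observed scores need to be tried as candidate cutoffs.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# msisensor2 scores from this thread; labels taken from the paired msisensor calls.
scores = [5.66, 1.26]
labels = [True, False]
print(best_cutoff(scores, labels))
```

With a realistic number of labeled samples from the same panel, the same function gives a data-driven starting point that can then be checked on held-out samples.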

huangl07 commented 1 year ago

I wonder whether there is any cutoff threshold for msisensor2 to define MSI-L.

Thank you for your attention.