icbi-lab / Immunophenogram

BSD 3-Clause "New" or "Revised" License

All samples have very high scores (all > 6, most of them are 10) #3

Open AlfredShawn opened 6 years ago

AlfredShawn commented 6 years ago

Hi Folks,

I ran this on multiple projects (different cancer types, including melanoma, HNSCC, and RCC), and the reported scores for all samples are over 6; most of them equal 10. I read the code and checked the results step by step, and found that the MHC part (many HLA genes) is scaled high because these genes' expression is really high. If I am right, the normalization step tries to scale the values into the range -3 to 3, but this does not seem suitable for all the data I have. Interestingly, even the samples from the example data in this package all have high scores (all over 6). Could you help with this? Thanks.

Yu
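To make the saturation described above concrete, here is a minimal Python sketch. It is not the package's code. It assumes that expression values are standardized within each sample, that category averages are summed with a sign per category, and that the aggregate is clipped to the range 0 to 3 and rescaled to 0 to 10. The gene names, category labels, signs, and expression values are all hypothetical.

```python
from statistics import mean, stdev

def within_sample_z(expr):
    """Standardize one sample's values against that sample's own mean and SD (assumed scheme)."""
    mu = mean(expr.values())
    sd = stdev(expr.values())
    return {gene: (value - mu) / sd for gene, value in expr.items()}

def map_to_score(aggregate_z):
    """Assumed mapping: clip the aggregate to [0, 3] and rescale to an integer 0-10."""
    if aggregate_z <= 0:
        return 0
    if aggregate_z >= 3:
        return 10
    return round(aggregate_z * 10 / 3)

# Hypothetical categories: (sign, member genes). Signs are illustrative only.
CATEGORIES = {
    "mhc_like": (+1, ["hla_like_1", "hla_like_2"]),
    "effector_like": (+1, ["effector_gene"]),
    "suppressor_like": (-1, ["suppressor_gene"]),
    "checkpoint_like": (-1, ["checkpoint_gene"]),
}

def sample_score(expr):
    """Sum the signed category means of within-sample z-scores, then map to 0-10."""
    z = within_sample_z(expr)
    aggregate = sum(
        sign * mean(z[g] for g in genes) for sign, genes in CATEGORIES.values()
    )
    return map_to_score(aggregate)

# Hypothetical log-scale values: the MHC-like genes are strongly expressed,
# as is typical for HLA genes in most tissues.
example = {
    "hla_like_1": 13.0,
    "hla_like_2": 12.0,
    "effector_gene": 7.0,
    "suppressor_gene": 2.0,
    "checkpoint_gene": 1.0,
}
score = sample_score(example)
print(score)
```

Under these assumptions, any sample in which the positively weighted genes sit well above the sample mean and the negatively weighted genes sit well below it lands at or near the upper clip, regardless of cohort. Checking the aggregate before the clip is applied would show whether this is what happens in real data.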

ahdee commented 5 years ago

@AlfredShawn Over a year later: yes, I am getting the same thing. I ran a histogram and the majority of the IPS scores are 10. I think there is something wrong. Did you ever resolve this? It's not a scaling issue, since that is just for visualization. Even the example data does not seem to match what I downloaded from TCGA. For example, the sample TCGA-04-1348 has the following values:

VPS13A-AS1 0.0000
UBE2Q2P2 1.5835
HMGB1P1 2.0365
TIMM23B 6.6362
MOXD2P 0.0000

However, this is very different from what I got for the same sample downloaded elsewhere, as log2(tpm + 1), for TCGA.04.1348.01:

VPS13A-AS1 0.0000000
UBE2Q2P2 2.5410171
HMGB1P1 0.5361526
TIMM23B 2.0000036
MOXD2P 0.0000000

I don't expect the values to match completely, but the ratios should be similar? Very confused. I also went ahead and calculated scores for about 10K TCGA samples, and this histogram clearly shows that there is something majorly wrong.
[Image: histogram of IPS scores for about 10K TCGA samples]
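One way to quantify the mismatch between the two TCGA-04-1348 listings is a rank correlation, which does not depend on how either source was scaled. The sketch below uses only the five values quoted above. The variable and function names are illustrative, and five genes are far too few to draw a conclusion, so treat it as a template for a comparison across the full gene set.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

genes = ["VPS13A-AS1", "UBE2Q2P2", "HMGB1P1", "TIMM23B", "MOXD2P"]
example_data = [0.0000, 1.5835, 2.0365, 6.6362, 0.0000]            # package example
downloaded = [0.0000000, 2.5410171, 0.5361526, 2.0000036, 0.0000000]  # log2(tpm + 1)

rho = spearman(example_data, downloaded)
print(f"Spearman rho over {len(genes)} genes: {rho:.3f}")
```

A rank correlation close to 1 over thousands of genes would suggest the two sources differ mainly in normalization; a low value would point to a sample or annotation mismatch.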