wwylab / DeMixT

GNU General Public License v3.0

Can I apply DeMixT to a raw count matrix with ~26000 genes? #22

Closed pigyun906 closed 1 year ago

pigyun906 commented 1 year ago

Hi, thank you for developing a nice tool.

I followed the DeMixT tutorial to get the TmS value described in "Estimation of tumor cell total mRNA expression in 15 cancer types predicts disease progression", but DeMixT_preprocessing() failed with an error and did not generate a matrix. My raw count matrix has ~26000 genes, whereas you selected the first ~9000 genes out of about 50000 from the GDC mRNA Analysis Pipeline, so I suspect your selection criteria are too strict for my data. In this case, should I obtain a count matrix with ~50000 genes, or is there a way to apply DeMixT to my raw count data (~26000 genes)? Here are the cutoffs I used and the number of genes remaining for each combination:

cutoff_normal_range = c(0.1, 1.0)
cutoff_tumor_range = c(0, 2.5)
cutoff_step = 0.2
num_gene_remaining_different_cutoffs <- subset_sd_gene_remaining(DeMixT_CRC, label, 
                                                                 cutoff_normal_range, 
                                                                 cutoff_tumor_range,
                                                                 cutoff_step)
   normal.cutoff.low normal.cutoff.high tumor.cutoff.low tumor.cutoff.high num.gene.remaining
1                0.1                0.3                0               0.2                  0
2                0.1                0.3                0               0.4                  0
3                0.1                0.3                0               0.6                  0
4                0.1                0.3                0               0.8                  0
5                0.1                0.3                0               1.0                  0
6                0.1                0.3                0               1.2                  0
7                0.1                0.3                0               1.4                  0
8                0.1                0.3                0               1.6                  0
9                0.1                0.3                0               1.8                  0
10               0.1                0.3                0               2.0                  0
11               0.1                0.3                0               2.2                  0
12               0.1                0.3                0               2.4                  0
13               0.1                0.5                0               0.2                  0
14               0.1                0.5                0               0.4                  0
15               0.1                0.5                0               0.6                  0
16               0.1                0.5                0               0.8                  0
17               0.1                0.5                0               1.0                  0
18               0.1                0.5                0               1.2                  0
19               0.1                0.5                0               1.4                  0
20               0.1                0.5                0               1.6                  0
21               0.1                0.5                0               1.8                  0
22               0.1                0.5                0               2.0                  1
23               0.1                0.5                0               2.2                  1
24               0.1                0.5                0               2.4                  1
25               0.1                0.7                0               0.2                  0
26               0.1                0.7                0               0.4                  0
27               0.1                0.7                0               0.6                  0
28               0.1                0.7                0               0.8                  0
29               0.1                0.7                0               1.0                  1
30               0.1                0.7                0               1.2                  1
31               0.1                0.7                0               1.4                  1
32               0.1                0.7                0               1.6                  1
33               0.1                0.7                0               1.8                  3
34               0.1                0.7                0               2.0                 70
35               0.1                0.7                0               2.2                 87
36               0.1                0.7                0               2.4                 87
37               0.1                0.9                0               0.2                  0
38               0.1                0.9                0               0.4                  0
39               0.1                0.9                0               0.6                  0
40               0.1                0.9                0               0.8                  0
41               0.1                0.9                0               1.0                  1
42               0.1                0.9                0               1.2                  2
43               0.1                0.9                0               1.4                  2
44               0.1                0.9                0               1.6                  5
45               0.1                0.9                0               1.8                 86
46               0.1                0.9                0               2.0               1063
47               0.1                0.9                0               2.2               1269
48               0.1                0.9                0               2.4               1299

Thank you.

jiyunmaths commented 1 year ago

Hi @pigyun906, it seems your data have larger expression variation. Could you loosen cutoff_normal_range and cutoff_tumor_range to larger values, for example cutoff_normal_range = c(0, 2.0) and cutoff_tumor_range = c(0, 2.8)? You can visualize the expression variation in the tumor and normal samples using the function plot_sd, which is included in the DeMixT_preprocessing.R script, for example plot_sd(PRAD, label), and then adjust the two cutoffs accordingly. Thanks.
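
To illustrate why widening the ranges can rescue genes, here is a minimal base R sketch on simulated counts. It is not DeMixT code: count_genes_in_sd_range() is a hypothetical helper, and it assumes the filter is applied to the per-gene standard deviation of log2(count + 1) within the normal and tumor groups, which may differ from what DeMixT_preprocessing.R actually does. The simulated data and the "Normal"/"Tumor" labels are also made up for the example.

```r
# Illustrative sketch only: count_genes_in_sd_range() is a hypothetical helper,
# not a DeMixT function. Assumption: the filter uses the per-gene standard
# deviation of log2(count + 1) within each sample group.
set.seed(1)

n_genes  <- 2000
n_normal <- 10
n_tumor  <- 30

# Simulated, highly variable raw counts: genes in rows, samples in columns.
counts <- matrix(rnbinom(n_genes * (n_normal + n_tumor), mu = 200, size = 1),
                 nrow = n_genes)
label <- factor(c(rep("Normal", n_normal), rep("Tumor", n_tumor)))

count_genes_in_sd_range <- function(counts, label, normal_range, tumor_range) {
  log_expr  <- log2(counts + 1)
  # Per-gene standard deviation within each group.
  sd_normal <- apply(log_expr[, label == "Normal", drop = FALSE], 1, sd)
  sd_tumor  <- apply(log_expr[, label == "Tumor",  drop = FALSE], 1, sd)
  # A gene is kept only if both SDs fall inside their respective ranges.
  keep <- sd_normal >= normal_range[1] & sd_normal <= normal_range[2] &
          sd_tumor  >= tumor_range[1]  & sd_tumor  <= tumor_range[2]
  sum(keep)
}

# Narrow ranges (like the first rows of the table above) versus the
# wider ranges suggested in this reply.
narrow <- count_genes_in_sd_range(counts, label, c(0.1, 0.3), c(0, 0.2))
wide   <- count_genes_in_sd_range(counts, label, c(0, 2.0), c(0, 2.8))
cat("narrow cutoffs keep", narrow, "genes; wide cutoffs keep", wide, "genes\n")
```

Because the wide ranges contain the narrow ones, they can never keep fewer genes. On real data the equivalent step is to rerun subset_sd_gene_remaining(DeMixT_CRC, label, cutoff_normal_range, cutoff_tumor_range, cutoff_step) with the widened ranges and check the num.gene.remaining column.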

pigyun906 commented 1 year ago

It works smoothly. Thanks.