RahmanTeam / DECoN

How to select meaningful CNVs? #46

Open thedam opened 1 year ago

thedam commented 1 year ago

Hey, after a lot of pain I've successfully run DECoN on 208 WES samples. For each sample I get roughly 300-1000 CNV calls, and most of them are rubbish. Is there any hint on how I should filter the final cnv.calls_plot data frame in order to refine the results?

     ID     sample correlation N.comp start.p end.p        type nexons     start       end chromosome                       id    BF reads.expected reads.observed reads.ratio     Gene start.b end.b  chr
1     1 s1   0.9927666     13       1     1    deletion      1     69037     70008          1         chr1:69037-70008  2.66            309            216       0.699    OR4F5       1     1 chr1
2     2 s1   0.9927666     13     380   380 duplication      1   1645126   1645262          1     chr1:1645126-1645262  3.25            193            263       1.360   CDK11B      11    11 chr1
3     3 s1   0.9927666     13     665   665 duplication      1   3469433   3469593          1     chr1:3469433-3469593  2.69            145            197       1.360 ARHGEF16       5     5 chr1
4     4 s1   0.9927666     13    1842  1843    deletion      2  12917693  12920413          1   chr1:12917693-12920413 28.60            936            584       0.624  PRAMEF7       1     2 chr1
5     5 s1   0.9927666     13    1846  1848    deletion      3  13049808  13053659          1   chr1:13049808-13053659 33.00           1679            995       0.593 PRAMEF27       1     3 chr1
6     6 s1   0.9927666     13    1851  1852 duplication      3  13175281  13198906          1   chr1:13175281-13198906 66.20           1019           2081       2.040  PRAMEF9       1     2 chr1
6.1   6 s1   0.9927666     13    1853  1853 duplication      3  13175281  13198906          1   chr1:13175281-13198906 66.20           1019           2081       2.040 PRAMEF13       1     1 chr1
7     7 s1   0.9927666     13    1856  1856    deletion      1  13260222  13260803          1   chr1:13260222-13260803 18.60            321            136       0.424  PRAMEF5       1     1 chr1
8     8 s1   0.9927666     13    2208  2208    deletion      1  16575157  16575229          1   chr1:16575157-16575229  4.29            440            300       0.682    NBPF1       4     4 chr1
9     9 s1   0.9927666     13    2217  2217    deletion      1  16587050  16587261          1   chr1:16587050-16587261 15.30            640            350       0.547    NBPF1      13    13 chr1
10   10 s1   0.9927666     13    2219  2221    deletion      3  16588849  16592021          1   chr1:16588849-16592021  8.34           1302            933       0.717    NBPF1      15    17 chr1
11   11 s1   0.9927666     13    2865  2865 duplication      1  21480051  21480223          1   chr1:21480051-21480223  8.55            225            354       1.570    NBPF3      10    10 chr1
12   12 s1   0.9927666     13    2865  2867    deletion      2  21480051  21481769          1   chr1:21480051-21481769  5.79            752            610       0.811    NBPF3      10    12 chr1
13   13 s1   0.9927666     13    3042  3042    deletion      1  22003003  22003088          1   chr1:22003003-22003088  5.81            169             98       0.580   CELA3A       2     2 chr1
14   14 s1   0.9927666     13    5325  5325 duplication      2  39763689  39763791          1   chr1:39763689-39763791  5.27            477            629       1.320     PPIE       2     2 chr1
14.1 14 s1   0.9927666     13    5326  5326 duplication      2  39763689  39763791          1   chr1:39763689-39763791  5.27            477            629       1.320    BMP8B       3     3 chr1
15   15 s1   0.9927666     13    5329  5329 duplication      1  39788152  39788485          1   chr1:39788152-39788485  2.93            442            574       1.300    BMP8B       1     1 chr1
16   16 s1   0.9927666     13    8814  8814    deletion      1  89010904  89011116          1   chr1:89010904-89011116  3.95            257            170       0.661     GBP3       4     4 chr1
17   17 s1   0.9927666     13    9677  9686 duplication     12 103571603 103618100          1 chr1:103571603-103618100 57.20           4654           6320       1.360    AMY2B       1    10 chr1
17.1 17 s1   0.9927666     13    9687  9688 duplication     12 103571603 103618100          1 chr1:103571603-103618100 57.20           4654           6320       1.360    AMY2A       1     2 chr1
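
To make the question concrete, a naive filter on the columns above could be sketched as follows. This is only an illustration: `filter_calls` is a hypothetical helper, the thresholds (`BF >= 10`, at least 2 exons) are arbitrary placeholders rather than values recommended by DECoN, and only a few rows of the table are copied in by hand. Rows that share an `ID` (e.g. 6 and 6.1) are treated as one call spanning several genes.

```python
# A few rows copied from the table above (only the columns used here).
CALLS = [
    {"ID": 1, "type": "deletion", "nexons": 1, "BF": 2.66, "reads.ratio": 0.699, "Gene": "OR4F5"},
    {"ID": 4, "type": "deletion", "nexons": 2, "BF": 28.60, "reads.ratio": 0.624, "Gene": "PRAMEF7"},
    {"ID": 6, "type": "duplication", "nexons": 3, "BF": 66.20, "reads.ratio": 2.040, "Gene": "PRAMEF9"},
    {"ID": 6, "type": "duplication", "nexons": 3, "BF": 66.20, "reads.ratio": 2.040, "Gene": "PRAMEF13"},
    {"ID": 8, "type": "deletion", "nexons": 1, "BF": 4.29, "reads.ratio": 0.682, "Gene": "NBPF1"},
    {"ID": 17, "type": "duplication", "nexons": 12, "BF": 57.20, "reads.ratio": 1.360, "Gene": "AMY2B"},
]


def filter_calls(calls, min_bf=10.0, min_nexons=2):
    """Keep calls with enough support and merge rows that share an ID.

    min_bf and min_nexons are placeholder thresholds (assumptions),
    not values taken from the DECoN documentation.
    """
    merged = {}
    for row in calls:
        # Drop weakly supported or single-exon calls.
        if row["BF"] < min_bf or row["nexons"] < min_nexons:
            continue
        if row["ID"] in merged:
            # Same call reported once per gene: collapse into one record.
            merged[row["ID"]]["Gene"] += "," + row["Gene"]
        else:
            merged[row["ID"]] = dict(row)
    return list(merged.values())


for call in filter_calls(CALLS):
    print(call["ID"], call["type"], call["nexons"], call["BF"], call["Gene"])
```

Whether cut-offs of this kind are sensible, or whether something like call frequency across the 208 samples should be used instead, is exactly what is unclear.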

I guess I don't need to provide 200 or more samples to get higher resolution, since DECoN uses only a small subset of them as references (e.g. 13 in the picture below)?

image