genome / sciclone

An R package for inferring the subclonal architecture of tumors
Other
115 stars 55 forks source link

option 'binomial.bmm' not working #19

Closed nesilin closed 8 years ago

nesilin commented 8 years ago

Hi! I'm trying to compare different tools/methods to estimate clonality in tumors with TCGA data in order to see which samples have a clustering consensus between PyClone, Sciclone and probably Expands too. Not only methods but models too. Beta model as 'bmm' worked fine but I've tried to use 'binomial.bmm' model like this:

sc <- sciClone(vafs=vafs, copyNumberCalls=copyNumberCalls', regionsToExclude=NULL,
               sampleNames='sample_1', minimumDepth=10, clusterMethod="binomial.bmm",
               clusterParams='empty',
               cnCallsAreLog2=FALSE, useSexChrs=FALSE, doClustering=TRUE,
               verbose=TRUE, 
               copyNumberMargins=0.4, maximumClusters=20,
               annotation=NULL, doClusteringAlongMargins=TRUE,
               plotIntermediateResults=0)

And I got the following error:

[1] "checking input data..."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
56 sites (of 83 original sites) are copy number neutral and have adequate depth in all samples
27 sites (of 83 original sites) were removed because of copy-number alterations
0 sites (of 83 original sites) were removed because of inadequate depth
27 sites (of 83 original sites) were removed because of copy-number alterations or inadequate depth
[1] "clustering..."
kmeans initialization:
V1
0.0519283013246183
0.142023395832538
0.336833578224629
0.419642857142857
0.733333333333333
0.251004016064257
0.296137404232378
0.15625
0.0676238184942296
0.286014546319517
0.212617983109786
0.0441841565047945
0.0937402190923318
0.13343949044586
0.127155172413793
0.380713489409142
0.266546587505349
0.2320029563932
0.166202620930319
0.325301204819277
Using threshold:  0.7 
lb decreased from -375.904937 to -550.464025!
lb decreased from -550.464025 to -1305714.532528!
lb decreased from -1305714.532528 to -1958467.476601!
lb decreased from -1958467.476601 to -4554357.319464!
lb decreased from -4554357.319464 to -5222001.150376!
lb decreased from -5222001.150376 to -5222019.306761!
Error in binomial.bmm.filter.clusters(vafs.merged, vafs, vars, total.trials,  : 
  Not implemented because std dev not implemented for binomial!

Here is an example of the data frames I'm using:

> vafs

   Chromosome  Position Ref_Count Alt_Count  Var_freq
1           1  37948064       273        15  5.208333
2           1  48771547        78         6  7.142857
3           1  55464936       121        43 26.219512
4           1  91818104        26         4 13.333333
5           1 100327202        57         4  6.557377
6           1 109439595       106         5  4.504505
7           1 145537822       136        21 13.375796
8           1 156255757       497        12  2.357564
9           1 157738319       240       158 39.698492
10          1 183520990        69         6  8.000000
11          1 210010489        39        10 20.408163
12          1 248004586       130        94 41.964286
13          2  25048948        43        26 37.681159
14          2  42990448        27         5 15.625000
15          2  99013095        64         7  9.859155
16          2 220047043       203       105 34.090909
17          2 220164096       168         7  4.000000
18          2 227985759       101        15 12.931034
19          2 238443205       124        42 25.301205
20          3  95374425       158        35 18.134715
> copyNumberCalls

   Chromosome     Start       End        cn
1           1   3218610  29121711 2.0734022
2           1  29123985  29125119 0.4408012
3           1  29131442  45626175 2.0797354
4           1  45634781  45652241 1.0721450
5           1  45665580  50697381 2.1361308
6           1  50699438  50701121 5.0925915
7           1  50702054  71793221 2.1779940
8           1  71793222  71796991 0.8512177
9           1  71797759  80464059 2.2256088
10          1  80467161  80469130 0.8779435
11          1  80470816  91418026 2.2192928
12          1  91419132  91427460 1.2402234
13          1  91429035  96476493 2.1854040
14          1  96476513  96476570 0.8682004
15          1  96478125 115974088 2.1914716
16          1 115985027 115986104 0.8019588
17          1 115990063 120527361 2.1860100
18          1 149879545 183765232 2.6667822
19          1 183765806 183765914 1.1654604
20          1 183771039 219399794 2.7109528

Thanks!

PS: To my knowledge PyClone do not use a minimum depth for clustering. Do you recommend to set the minimumDepth parameter to 0 in order to have better comparable results?

chrisamiller commented 8 years ago

Sorry for the slow response! Via my coauthor, @bswhite:

I think the problem is that Ines has set clusterParams='empty'. Instead, you want the default clusterParams="no.apply.overlapping.std.dev.condition"

without that, it looks like apply.overlapping.std.dev.condition will be TRUE in binomial.bmm.filter.clusters and this condition will be tripped

if(all(indices.to.keep==TRUE) & (apply.overlapping.std.dev.condition == TRU\

E) & (N.c > 1)) { stop("Not implemented because std dev not implemented for binomial!\n") } # End apply.overlapping.std.dev.condition

As for minimum depth, I don't recommend going below 50, unless absolutely necessary. For a striking example of how depth of coverage influences sampling error and the ability to discern clusters, take a look at Figure 5 here: http://www.cell.com/cell-systems/abstract/S2405-4712(15)00113-1

nesilin commented 8 years ago

Thanks! Now works!