Wedge-lab / dpclust

Dirichlet Process based methods for subclonal reconstruction of tumours
GNU Affero General Public License v3.0

Error in apply(mutation.preferences, 1, max) #1

Closed lydiayliu closed 5 years ago

lydiayliu commented 5 years ago

Hi Stefan! This is Lydia :) (hope you are the one getting the message, or else it's a little awkward)

I've been experiencing the following error when running DPClust. I run DPClust through the dpc.R script from the dpclust-smchet-docker GitHub page. When I call DPClust:::DirichletProcessClustering, I get the following error:

    Error in apply(mutation.preferences, 1, max) :
      dim(X) must have a positive length
    Calls: <Anonymous> -> oneDimensionalClustering -> apply
    Execution halted

After investigating the function, I realized that this occurs when tumour_optimaInfo.txt has only one row with a non-zero value in column 3 (no.of.mutations), like this:

    cluster.no  location           no.of.mutations
    1           0.596868884540117  0
    2           1.32093933463796   6300
    3           2.47553816046967   0
    4           2.56360078277886   0
    5           2.69080234833659   0
    6           2.80821917808219   0
    7           2.89628180039139   0
    8           3.29745596868885   0
    9           3.41487279843444   0
    10          3.57142857142857   0
    11          3.73776908023483   0
    12          3.85518590998043   0
    13          3.93346379647749   0
    14          4.20743639921722   0
    15          4.29549902152642   0
    16          4.8238747553816    0
    17          4.83365949119374   0
    18          4.853228962818     0
    19          4.86301369863014   0
    20          4.87279843444227   0
    21          4.8825831702544    0
    22          4.89236790606654   0
    23          4.90215264187867   0
    24          4.9119373776908    0
    25          4.93150684931507   0
    26          4.95107632093933   0
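A quick way to check whether a given run is in this state is to count the rows with a non-zero no.of.mutations. The sketch below is only an illustration: it parses the first three rows of the table inline, and the variable names (optima_text, non_empty, single_cluster_case) are made up for the example. For a real file, something like read.table("tumour_optimaInfo.txt", header = TRUE) should work, assuming the file is whitespace- or tab-delimited.

```r
# First three rows of the table, in the same column layout.
optima_text <- "cluster.no location no.of.mutations
1 0.596868884540117 0
2 1.32093933463796 6300
3 2.47553816046967 0"

optima <- read.table(text = optima_text, header = TRUE)

# Keep only the clusters that actually received mutations.
non_empty <- optima[optima$no.of.mutations > 0, ]

# TRUE when exactly one cluster is non-empty, the case that triggers the error.
single_cluster_case <- nrow(non_empty) == 1
print(single_cluster_case)
```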

The error is due to the code in DPClust:::oneDimensionalClustering, lines 99-100:

    # Obtain likelyhood of most likely cluster assignments
    most.likely.cluster.likelihood = apply(mutation.preferences, 1, max)

Because only one column of mutation.preferences is selected through non_empty_clusters, R drops the column dimension and the result condenses down to a plain vector. A vector has no dim attribute, so apply no longer works. A simple fix would be to force mutation.preferences back into a two-dimensional object such as a data.table.
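The dimension drop can be reproduced on a toy matrix. This is a minimal sketch, not the DPClust source: the values are invented, the object is assumed to be a plain matrix, and non_empty_clusters is assumed to be a vector of column indices. One common way to keep the result two-dimensional is drop = FALSE when subsetting.

```r
# Toy stand-in for mutation.preferences: 4 mutations x 3 clusters (made-up values).
mutation.preferences <- matrix(
  c(0.1, 0.8, 0.1,
    0.2, 0.7, 0.1,
    0.0, 0.9, 0.1,
    0.3, 0.6, 0.1),
  nrow = 4, byrow = TRUE
)

# Assumed selector: only cluster 2 has mutations assigned.
non_empty_clusters <- c(2)

# Default subsetting drops the column dimension and returns a plain vector,
# so dim() is NULL and apply(dropped, 1, max) would fail.
dropped <- mutation.preferences[, non_empty_clusters]
print(is.null(dim(dropped)))

# drop = FALSE keeps a 4 x 1 matrix, so apply() works again.
kept <- mutation.preferences[, non_empty_clusters, drop = FALSE]
most.likely.cluster.likelihood <- apply(kept, 1, max)
print(most.likely.cluster.likelihood)
```

Whether the fix belongs at the subsetting step or just before the apply call depends on how the rest of oneDimensionalClustering uses the object, which this sketch does not cover.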

Also, this error is not 100% reproducible. Since Dirichlet process clustering is stochastic, the error sometimes goes away because mutations get assigned to other local optima.
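For debugging an intermittent failure like this, fixing the random seed usually makes a failing run replayable. The sketch below uses a toy sampler, not DPClust's; count_non_empty and its weights are hypothetical. It also assumes the real code draws from R's global random number generator, which has not been verified here.

```r
# Toy stand-in for a stochastic assignment step (not DPClust's sampler).
count_non_empty <- function(seed, n_mutations = 50) {
  set.seed(seed)
  # Heavily skewed weights, so most draws land in the first cluster.
  weights <- c(0.96, 0.01, 0.01, 0.01, 0.01)
  assignments <- sample(length(weights), n_mutations,
                        replace = TRUE, prob = weights)
  # Number of clusters that received at least one mutation.
  length(unique(assignments))
}

# Same seed, same outcome: a failing run can be replayed.
print(count_non_empty(1) == count_non_empty(1))

# Different seeds may give different numbers of non-empty clusters.
print(sapply(1:10, count_non_empty))
```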

Thanks, Lydia