chrisamiller / fishplot

Create timecourse "fish plots" that show changes in the clonal architecture of tumors

Serious bug #4

Closed ericpeny closed 7 years ago

ericpeny commented 7 years ago

I hit the following error in several projects:

Error in if (yst > 85 | yst < 15) { : missing value where TRUE/FALSE needed

Is there any way to fix this?
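For context, this is the message R gives whenever an `if` condition evaluates to `NA`. The sketch below reproduces it in isolation, assuming only that `yst` (the name taken from the error message) ends up as `NA`. How fishplot computes that value is not shown in this thread, so the assignment is a hypothetical stand-in.

```r
# Hypothetical stand-in for the value named in the error message
yst <- NA_real_

# NA > 85 is NA, NA < 15 is NA, and NA | NA is NA
cond <- yst > 85 | yst < 15

# if() rejects an NA condition; capture the error instead of halting
msg <- tryCatch(
  {
    if (cond) "out of range" else "in range"
  },
  error = function(e) conditionMessage(e)
)

print(is.na(cond))
print(msg)
```

In practice this usually points at missing or malformed values somewhere upstream of the plotting call, not at the comparison itself.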

chrisamiller commented 7 years ago

Happy to take a look, but I'm going to need more info in order to reproduce the problem. Can you provide example code that triggers the error?

ericpeny commented 7 years ago

Thank you for your reply! I like your fish plot package very much and found it extremely useful and easy to use!

I have attached one of the data sets that triggers this error. By the way, there is another error that I hit every time I use a for loop to generate several fish plots, so I generally only plot the first model.

Thank you again. Best, Yu

On May 26, 2017 at 2:08:01 AM, Chris Miller (notifications@github.com) wrote:

> Happy to take a look, but I'm going to need more info in order to reproduce the problem. Can you provide example code that triggers the error?


chrisamiller commented 7 years ago

Nothing came through on the attachments. You're going to have to come to the GitHub site and paste your code in here.

pengyuguai commented 7 years ago

My data:

mutation_id  sample_id  cluster_id  cellular_prevalence  cellular_prevalence_std  variant_allele_frequency
17_7576852   pre        1           0.118187361442       0.0371124574781          0.0703296703297
17_7576852   post       1           0.384729413268       0.116868376887           0.23629489603
17_7576852   baseline   1           0.137392128068       0.048717093665           0.0819672131148
2_29432682   pre        4           0.00322732926268     0.00600441751744         0.0
2_29432682   post       4           0.0158052933077      0.0235465191293          0.00532859680284
2_29432682   baseline   4           0.00815564692476     0.00965747091265         0.0
2_29443631   pre        3           0.00418149419492     0.00869147203052         0.0
2_29443631   post       3           0.171021194911       0.0566809939796          0.106666666667
2_29443631   baseline   3           0.0041901210051      0.00490413704689         0.0
2_42544448   pre        2           0.11227247241        0.0355450044597          0.0681647940075
2_42544448   post       2           0.286969070842       0.0803700908702          0.168112798265
2_42544448   baseline   2           0.137774068093       0.0469561838298          0.0804597701149
7_55233037   pre        3           0.0638062247272      0.0234438676448          0.0413555427915
7_55233037   post       3           0.17255135509        0.0555622955987          0.107033639144
7_55233037   baseline   3           0.00436017153106     0.00524178601463         0.0
9_21974106   pre        3           0.0330782138088      0.0109540486973          0.0197044334975
9_21974106   post       3           0.118071374965       0.0365014801854          0.0693069306931
9_21974106   baseline   3           0.0403576624203      0.0169437299164          0.0243309002433

Code:

library(devtools)
library(clonevol)
library(fishplot)
library(stringr)

df <- read.csv('tables/loci.tsv', header = T, stringsAsFactors = F, sep = '\t')

final_df <- data.frame(cluster = df$cluster_id,
                       Baseline.ccf = df[df$sample_id == 'baseline', ]$cellular_prevalence * 100,
                       Pre.vaf = df[df$sample_id == 'pre', ]$cellular_prevalence * 100,
                       Post.vaf = df[df$sample_id == 'post', ]$cellular_prevalence * 100)

vaf.col.names <- grep(".vaf", colnames(final_df), value=TRUE)

x <- infer.clonal.models(variants = final_df,
                         cluster.col.name = "cluster",
                         vaf.col.names = vaf.col.names,
                         subclonal.test = "bootstrap",
                         subclonal.test.model = "non-parametric",
                         cluster.center = "mean",
                         num.boots = 1000,
                         founding.cluster = NULL,
                         min.cluster.vaf = 0,
                         p.value.cutoff = 0.4,
                         model = 'monoclonal')

plot.clonal.models(x$models, matched=x$matched, variants=final_df,
                   clone.shape="bell", box.plot=TRUE,
                   out.format="pdf", overwrite.output=TRUE,
                   scale.monoclonal.cell.frac=TRUE, cell.frac.ci=TRUE,
                   tree.node.shape="circle", tree.node.size=40, tree.node.text.size=0.65,
                   width=11, height=5, out.dir="output")

f = generateFishplotInputs(results=x)
fishes = createFishPlotObjects(f)

pdf('fish.pdf', width=15, height=50)
for (i in 1:length(fishes)){
  fish = layoutClones(fishes[[i]])
  fish = setCol(fish,f$clonevol.clone.colors)
  fishPlot(fish,shape="spline", vlines=seq(1, length(vaf.col.names)), vlab=vaf.col.names)
}
dev.off()

chrisamiller commented 7 years ago

If I've copied your code correctly, your input data frame looks like this, which does not make any sense. The points that you have assigned to cluster 1 span several distinct vaf ranges.


```
   cluster Baseline.ccf    Pre.vaf  Post.vaf
1        1   13.7392128 11.8187361 38.472941
2        1    0.8155647  0.3227329  1.580529
3        1    0.4190121  0.4181494 17.102119
4        4   13.7774068 11.2272472 28.696907
5        4    0.4360172  6.3806225 17.255136
6        4    4.0357662  3.3078214 11.807137
7        3   13.7392128 11.8187361 38.472941
8        3    0.8155647  0.3227329  1.580529
9        3    0.4190121  0.4181494 17.102119
10       2   13.7774068 11.2272472 28.696907
11       2    0.4360172  6.3806225 17.255136
12       2    4.0357662  3.3078214 11.807137
13       3   13.7392128 11.8187361 38.472941
14       3    0.8155647  0.3227329  1.580529
15       3    0.4190121  0.4181494 17.102119
16       3   13.7774068 11.2272472 28.696907
17       3    0.4360172  6.3806225 17.255136
18       3    4.0357662  3.3078214 11.807137
```
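The repeating pattern in this data frame is consistent with R's vector recycling: `data.frame()` was given an 18-element cluster column alongside 6-element value columns, and it repeats the shorter columns a whole number of times without warning. A minimal sketch using the cluster and baseline values from this thread (the variable names `cluster_id`, `baseline`, and `recycled` are illustrative only):

```r
# cluster_id column as posted: one entry per (mutation, sample) row, 18 in total
cluster_id <- c(1, 1, 1, 4, 4, 4, 3, 3, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3)

# Baseline values after filtering to sample_id == 'baseline': only 6 entries
baseline <- c(13.7392128, 0.8155647, 0.4190121, 13.7774068, 0.4360172, 4.0357662)

# data.frame() recycles the 6-value column three times to fill 18 rows
recycled <- data.frame(cluster = cluster_id, Baseline.ccf = baseline)

print(nrow(recycled))
# Rows 1 and 7 carry the same baseline value but different cluster labels
print(recycled[c(1, 7), ])
```

Because 18 is an exact multiple of 6, R raises no error here. Had the lengths not divided evenly, `data.frame()` would have stopped with a "differing number of rows" error instead.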
pengyuguai commented 7 years ago

The data frame you got is correct; it is the cancer cell fraction (please ignore the header stating vaf). Below is the VAF data frame for these three stages. I got these data from PyClone. The values may span such a large range because several positions had no data at some stage, so for those I assumed a depth of 0 for alt and the true depth for ref.

final_df
   cluster Baseline.vaf  Pre.vaf   Post.vaf
1        1     8.196721 7.032967 23.6294896
2        1     0.000000 0.000000  0.5328597
3        1     0.000000 0.000000 10.6666667
4        4     8.045977 6.816479 16.8112798
5        4     0.000000 4.135554 10.7033639
6        4     2.433090 1.970443  6.9306931
7        3     8.196721 7.032967 23.6294896
8        3     0.000000 0.000000  0.5328597
9        3     0.000000 0.000000 10.6666667
10       2     8.045977 6.816479 16.8112798
11       2     0.000000 4.135554 10.7033639
12       2     2.433090 1.970443  6.9306931
13       3     8.196721 7.032967 23.6294896
14       3     0.000000 0.000000  0.5328597
15       3     0.000000 0.000000 10.6666667
16       3     8.045977 6.816479 16.8112798
17       3     0.000000 4.135554 10.7033639
18       3     2.433090 1.970443  6.9306931

chrisamiller commented 7 years ago

Why would these two points be in different clusters, despite having identical values?

1   13.7392128 11.8187361 38.472941
3   13.7392128 11.8187361 38.472941

To be clear, the input to clonevol is expected to be every input variant, one per line, with cluster assignments and VAFs for each. No variant should appear more than once, and every point belonging to the same cluster should have the same cluster number.
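One way to get from the long table posted above (one row per mutation per sample) to that one-row-per-variant layout is sketched below in base R. It is not necessarily how the poster fixed their script. The names `loci`, `variants`, and `sample_vaf` are hypothetical, the `vaf` column holds the posted `variant_allele_frequency` values rounded to four decimals, and the output column names follow the ones used earlier in the thread.

```r
# Long-format input: one row per (mutation, sample), values rounded from the thread
loci <- data.frame(
  mutation_id = rep(c("17_7576852", "2_29432682", "2_29443631",
                      "2_42544448", "7_55233037", "9_21974106"), each = 3),
  sample_id = rep(c("pre", "post", "baseline"), times = 6),
  cluster_id = rep(c(1, 4, 3, 2, 3, 3), each = 3),
  vaf = c(0.0703, 0.2363, 0.0820,
          0.0000, 0.0053, 0.0000,
          0.0000, 0.1067, 0.0000,
          0.0682, 0.1681, 0.0805,
          0.0414, 0.1070, 0.0000,
          0.0197, 0.0693, 0.0243),
  stringsAsFactors = FALSE
)

# One row per variant, carrying its single cluster assignment
variants <- unique(loci[, c("mutation_id", "cluster_id")])
names(variants)[2] <- "cluster"

# Look up each variant's VAF in one sample, matched by mutation_id, as a percentage
sample_vaf <- function(sample) {
  rows <- loci[loci$sample_id == sample, ]
  100 * rows$vaf[match(variants$mutation_id, rows$mutation_id)]
}

variants$Baseline.vaf <- sample_vaf("baseline")
variants$Pre.vaf <- sample_vaf("pre")
variants$Post.vaf <- sample_vaf("post")

# Sanity checks: no variant appears twice, and no VAF is missing
stopifnot(anyDuplicated(variants$mutation_id) == 0)
stopifnot(!anyNA(variants[, c("Baseline.vaf", "Pre.vaf", "Post.vaf")]))

print(variants)
```

Matching on `mutation_id` rather than relying on row order keeps each variant's values aligned with its own cluster label, which is exactly what went wrong in the recycled data frame. Whether the cluster assignments themselves are sensible is a separate question for the clustering step.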

pengyuguai commented 7 years ago

Thank you for your reply! I now know where the problem is. I did not generate the data frame correctly. After I used the correct one, fish plot was generated successfully. Thank you very much.

chrisamiller commented 7 years ago

Glad to hear it! Resolving.

haoecust commented 6 years ago

@pengyuguai I got the same error as you: 'Error in if (yst > 85 | yst < 15) { : missing value where TRUE/FALSE needed'. How did you fix it?