FunGeST / Palimpsest

An R package for studying mutational signatures and structural variant signatures along clonal evolution in cancer.

Signatures Contributing to Driver Genes #36

Open mkinnaman opened 4 years ago

mkinnaman commented 4 years ago

I keep getting the same error every time I try to plot the drivers vs signatures plot:

matprob <- matrix(nrow=length(drivers),ncol=length(SBS_OSCE1),dimnames=list(drivers, SBS_OSCE1))
sig.cols <- paste0(rownames(SBS_OSCE1_sigs),".prob")#grep("prob",colnames(vcf.cod))
for(i in 1:nrow(matprob)){
  g <- rownames(matprob)[i]
  ind <- which(vcf.cod$gene_name==g)
  matprob[i,] <- apply(vcf.cod[ind,sig.cols],2,sum,na.rm=T)
}

Error in matprob[i, ] <- apply(vcf.cod[ind, sig.cols], 2, sum, na.rm = T) : number of items to replace is not a multiple of replacement length

Any thoughts?

mkinnaman commented 4 years ago

In addition, my last couple of runs have thrown errors during the de novo signature extraction command. Is this due to a small sample size?

Error: NMF::nmf - invalid argument 'rank': must be a single numeric value
In addition: Warning messages:
1: In (function (...) : NAs were produced due to errors in some of the runs:
  -#4[r=5] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
  -#5[r=6] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
  -#6[r=7] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
  -#7[r=8] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
  -#8[r=9] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
  -#9[r=10] -> elements of 'k' must be between 1 and 4 [in call to 'cutree']
2: Removed 25 rows containing missing values (geom_path).
3: Removed 62 rows containing missing values (geom_point).
4: In max(abs(diff(z))) : no non-missing arguments to max; returning -Inf

FunGeST commented 4 years ago

Hi,

Thanks for getting in touch, and I apologise for not getting back to you sooner.

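Regarding your first issue, I think it may just be caused by SBS_OSCE1 and rownames(SBS_OSCE1_sigs) being different lengths: matprob is created with length(SBS_OSCE1) columns, while sig.cols (and so the vector returned by apply()) has one entry per row name of SBS_OSCE1_sigs. Without your data I can't be 100% sure, though.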
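One way to rule out that mismatch is to size the matrix from sig.cols itself and to check that every expected column is present before the loop. The snippet below is only a sketch on made-up data: the gene names, signature names and probabilities are hypothetical stand-ins for the objects in the post, and colSums() is swapped in for apply(..., 2, sum).

```r
# Hypothetical toy data standing in for the objects in the post.
drivers   <- c("TP53", "RB1")
sig.names <- c("SBS1", "SBS5", "SBS13")
sig.cols  <- paste0(sig.names, ".prob")

vcf.cod <- data.frame(
  gene_name  = c("TP53", "TP53", "RB1", "ATRX"),
  SBS1.prob  = c(0.2, 0.5, 0.1, 0.3),
  SBS5.prob  = c(0.7, 0.4, NA, 0.3),
  SBS13.prob = c(0.1, 0.1, 0.9, 0.4)
)

# Fail early if any expected ".prob" column is missing from vcf.cod.
missing.cols <- setdiff(sig.cols, colnames(vcf.cod))
stopifnot(length(missing.cols) == 0)

# Size the matrix from sig.cols so the column count cannot disagree
# with the length of the vector assigned to each row.
matprob <- matrix(0, nrow = length(drivers), ncol = length(sig.cols),
                  dimnames = list(drivers, sig.names))

for (g in drivers) {
  ind <- which(vcf.cod$gene_name == g)
  # drop = FALSE keeps a data frame even for a single row or column.
  matprob[g, ] <- colSums(vcf.cod[ind, sig.cols, drop = FALSE], na.rm = TRUE)
}
```

Comparing length(sig.cols) with ncol(matprob) on the real objects should show quickly whether the two disagree.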

I'm less sure about your second issue. Out of interest, how small is your sample size? NMF extractions should work for smaller sample sizes. To get around this error, you could try setting the num_of_sigs argument of the NMF_Extraction() function to an integer.
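The warnings in the report come from cutree(), which cannot cut a tree into more groups than it has leaves. The base-R sketch below reproduces that limit under the assumption, which the thread does not confirm, that the input held only four samples. The matrix x is made-up data.

```r
set.seed(1)

# Four hypothetical observations, three features each.
x  <- matrix(rnorm(4 * 3), nrow = 4)
hc <- hclust(dist(x))

# Cutting into at most 4 groups is fine for a tree with 4 leaves.
groups <- cutree(hc, k = 4)

# Asking for 5 groups fails; capture the message instead of stopping.
msg <- tryCatch(cutree(hc, k = 5),
                error = function(e) conditionMessage(e))
print(msg)
```

If that assumption holds, every rank above 4 in the tested range would fail the same way, which would be consistent with the suggestion to fix num_of_sigs at a small integer.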

If the file NMF_Rank_Estimates.pdf has been plotted in your results directory, the x-axis value at the first minimum of the cophenetic plot is a good estimate of the optimal number of signatures in your input data. In the example plot attached to this comment, Palimpsest would choose 5 as the optimal number of signatures in the data.
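To make the "first minimum" rule concrete, here is a small sketch that picks the first local minimum from a vector of cophenetic coefficients. The values are invented for illustration, and this is not Palimpsest's own selection code, which may differ in detail.

```r
# Hypothetical cophenetic coefficients for ranks 2 to 10.
ranks <- 2:10
coph  <- c(0.99, 0.97, 0.95, 0.90, 0.93, 0.92, 0.94, 0.91, 0.89)

# A local minimum is a point where the curve stops falling and starts rising,
# i.e. a negative step followed by a positive step.
steps <- diff(coph)
idx   <- which(steps[-length(steps)] < 0 & steps[-1] > 0)[1] + 1

# idx is NA if the curve has no interior local minimum.
first_min_rank <- ranks[idx]
print(first_min_rank)
```

The resulting value could then be passed as the integer for num_of_sigs.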

Please let us know if this doesn't work and I'll try to find another solution!

Best wishes, Benedict