dariober / cnv_facets

Somatic copy variant caller (CNV) for next generation sequencing

Issues with install dependencies #37

Open umasaxena opened 3 years ago

umasaxena commented 3 years ago

Hello, I installed cnv_facets via bioconda, and --help prints the expected parameter information. However, when I run it on BAM files, I get the following error:

[2021-05-28 14:08:48] Loading file F00105616_BRCA.csv.gz...
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 7 elements
Calls: data.table -> read.table -> scan
Execution halted

Should I manually install the R package data.table in the conda environment? Any suggestions are appreciated. I am using R version 4.x.x.

umasaxena commented 3 years ago

An update to my previous query: I don't think this is a dependency problem; it looks more like an issue with the columns the script reads. Running cnv_facets.R on my tumor/normal data generates this csv.gz file:

Chromosome,Position,Ref,Alt,File1R,File1A,File1E,File1D,File2R,File2A,File2E,File2D
1,69372,.,.,1,0,0,0,0,0,0,0
1,69428,T,G,1,0,0,0,1,0,0,0
1,69590,T,A,0,0,0,0,2,0,0,0
1,69594,T,C,0,0,0,0,2,0,0,0
1,69610,C,T,0,0,0,0,1,0,0,0
1,69618,.,.,0,0,0,0,1,0,0,0
1,69635,G,C,1,0,0,0,1,0,0,0
1,69761,A,T,2,0,0,0,0,0,0,0
1,69808,G,C,1,0,0,0,0,0,0,0

This file contains 12 columns, but the error suggests that only 7 are expected. Could you please suggest a solution? I run it as follows:

cnv_facets.R -N 14 -t $tumor_file -n $normal_file -vcf $sample_vcf -o $sample --gbuild hg19 --annotation $annot --targets $interval

The intervals file is Broad.human.exome.b37.gatk.interval_list from the GATK resource bundle.
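The column count of the csv.gz file shown above can be checked for every line, not only the first few. The following is a minimal sketch, assuming a POSIX shell with gzip and awk available; check_csv_gz is a hypothetical helper name, not part of cnv_facets, and the plain comma split assumes no quoted fields that contain commas.

```shell
# Report lines of a gzipped CSV whose field count differs from the header's.
# check_csv_gz is a hypothetical helper name, not part of cnv_facets.
check_csv_gz() {
    gzip -dc "$1" | awk -F',' '
        NR == 1 { expected = NF; next }   # the header defines the expected width
        NF != expected {                  # report every line that deviates
            printf "line %d: %d fields (expected %d)\n", NR, NF, expected
            bad++
        }
        END { printf "%d mismatching line(s)\n", bad + 0 }
    '
}

# Example call (file name taken from the log above):
# check_csv_gz F00105616_BRCA.csv.gz
```

If this reports zero mismatching lines, every row has as many fields as the header, which would point away from this file as the source of the error.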

dariober commented 3 years ago

Unfortunately, R doesn't say which line caused the error. I suspect the problem is not with the pileup file but either with the annotation or the targets file. To check that, can you try in an R session, for the annotation file $annot:

library(data.table)
ann <- data.table(read.table("your-annotation-file", comment.char= '#', header= FALSE, sep= '\t', stringsAsFactors= FALSE, na.strings= ""))

and for the targets file $interval:

library(data.table)
targets <- data.table(read.table("your-targets-file", comment.char= '#', header= FALSE, sep= '\t'))

Also, check that these files are tab-separated and that every line has the same number of fields.
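The tab-separation check can also be done from the shell before opening R. The following is a rough sketch, assuming a POSIX shell with awk and sort; field_counts is a hypothetical helper name used only for illustration. It skips blank lines and lines starting with '#', roughly mirroring the read.table calls above, and tabulates how many tab-separated fields the remaining lines have. Lines starting with '@' (as in the header of a Picard-style interval_list) are not skipped, just as comment.char= '#' would not skip them, so if the file has such a header it would show up here as lines with a different field count.

```shell
# Tabulate how many tab-separated fields each line has.
# field_counts is a hypothetical helper name used only for this sketch.
field_counts() {
    awk -F'\t' '
        /^#/ || /^$/ { next }             # skip "#" comment lines and blank lines
        {
            n[NF]++                        # count lines per field count
            if (!(NF in first)) first[NF] = NR   # remember the first such line
        }
        END {
            for (k in n)
                printf "%d fields: %d line(s), first at line %d\n", k, n[k], first[k]
        }
    ' "$1" | sort -n
}

# Example calls (variables as used in the command earlier in this thread):
# field_counts "$annot"
# field_counts "$interval"
```

A well-formed file should produce a single output row; more than one row means the lines do not all have the same number of fields, and the reported line number shows where to start looking.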