broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0
164 stars 88 forks source link

Error when generating PoN #40

Closed andreykoch closed 5 years ago

andreykoch commented 6 years ago

Hello,

While trying to generate PoN, I'm getting an error.

I have previously generated 4 normal wig files using:

$readCounter --window 50000 --quality 20 --chromosome "chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY" $normal_sample.bam > $normal_sample.wig

Wig files have correct format:

fixedStep chrom=chr1 start=1 step=50000 span=50000 5 47 31

The actual command line for the PoN generation:

Rscript /rdseqdata/bin/ichorCNA-master/scripts/createPanelOfNormals.R --filelist /rndhome/akoch/SamBam_file_example/TMP/ref/wigPoNfiles.txt --gcWig /rdseqdata/bin/ichorCNA-master/inst/extdata/gc_hg19_50kb.wig --mapWig /rdseqdata/bin/ichorCNA-master/inst/extdata/map_hg19_50kb.wig --centromere /rdseqdata/bin/ichorCNA-master/inst/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt --outfile /rdseqdata/bin/ichorCNA-master/inst/extdata/PoN_hg19_50Kb_median_normAutosome_median.rds

The output before blowup:

Loading normal file:/rndhome/akoch/SamBam_file_example/TMP/ref/1801100551-NIK-3-26_TCATCTCC_L003_R1_export.wig Slurping: /rndhome/akoch/SamBam_file_example/TMP/ref/1801100551-NIK-3-26_TCATCTCC_L003_R1_export.wig Parsing: fixedStep chrom=chr1 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr2 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr3 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr4 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr5 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr6 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr7 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr8 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr9 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr10 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr11 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr12 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr13 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr14 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr15 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr16 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr17 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr18 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr19 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr20 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr21 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chr22 start=1 step=50000 span=50000 Parsing: fixedStep chrom=chrX start=1 step=50000 span=50000 Parsing: fixedStep chrom=chrY start=1 step=50000 span=50000 Sorting by decreasing chromosome size Reading GC and mappability files Slurping: /rdseqdata/bin/ichorCNA-master/inst/extdata/gc_hg19_50kb.wig Parsing: fixedStep chrom=1 start=1 step=50000 span=50000 Error in seq.default(from = as.integer(tokens[2]), by = as.integer(tokens[3]), : 'from' must be a finite number Calls: wigToRangedData -> seq -> seq.default Execution halted

Any help is greatly appreciated!

Andrey

gavinha commented 6 years ago

Hi @andreykoch

This is most likely due to the chromosome naming convention of the gc and map wig files. Your normal samples have UCSC ("chr") chromosome names while the gc_hg19_50kb.wig and map_hg19_50kb.wig provided in the inst/extdata/ directory uses NCBI (no "chr") convention.

There are 2 ways to solve this: 1) You can duplicate the gc_hg19_50kb.wig and map_hg19_50kb.wig files and change chromosome names to use UCSC format. There are only 24 lines to change. 2) We can improve the PoN creation script to handle this more directly. The runichorCNA.R script does this but we have not had the change to update the PoN creation script.

Hope this helps.

Best, Gavin

andreykoch commented 6 years ago

Hi @gavinha

Thank you for the hint. The first suggested way solved the problem.

Andrey

gavinha commented 5 years ago

Updated the PoN creation script to handle genome style chromosome naming.

https://github.com/broadinstitute/ichorCNA/commit/bc89d52dc5a02e77094e5ca770c691494aca0911