broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

Format of exons bed file #13

Open chowbina opened 6 years ago

chowbina commented 6 years ago

Hello,

What is the format of the exons bed file as input to ichorCNA?

Thank you, Sudhir

jurhoades commented 6 years ago

Hi Sudhir,

The file should be in BED format. ichorCNA expects a tab-delimited file where the first column is the chromosome, the second is the start position, and the third is the end position. It also currently requires column headers, although it doesn't care about the exact naming. An example could look like this:

chr    start    stop
1      100      200
1      250      300
2      101      400
etc...

It should be ok if there are additional columns but ichorCNA will only look at the first three. Hope this helps!
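For anyone preparing such a file by script, here is a minimal Python sketch that writes the three-column, tab-delimited layout described above and reads it back with a few sanity checks. The function names (`write_exon_table`, `check_exon_table`) are hypothetical helpers, not part of ichorCNA, and the checks (numeric coordinates, start not after end, non-numeric header) are my own assumptions about what is worth validating.

```python
import csv
import io


def write_exon_table(rows, handle):
    """Write (chrom, start, end) rows as tab-delimited text with a header line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # A header line is required; the exact column names are arbitrary.
    writer.writerow(["chr", "start", "stop"])
    for chrom, start, end in rows:
        writer.writerow([chrom, start, end])


def check_exon_table(handle):
    """Parse the first three columns, raising ValueError on a malformed line."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < 3:
        raise ValueError("expected a header line with at least three columns")
    if header[1].isdigit():
        # Assumption: a numeric second field means the header line is missing.
        raise ValueError("first line looks like data; a header line is required")
    intervals = []
    for line_no, fields in enumerate(reader, start=2):
        if len(fields) < 3:
            raise ValueError(f"line {line_no}: fewer than three columns")
        # Extra columns are allowed; only the first three are used here.
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start > end:
            raise ValueError(f"line {line_no}: start is after end")
        intervals.append((chrom, start, end))
    return intervals


buffer = io.StringIO()
write_exon_table([("1", 100, 200), ("1", 250, 300), ("2", 101, 400)], buffer)
buffer.seek(0)
parsed = check_exon_table(buffer)
print(parsed)
```

To write a real file instead of an in-memory buffer, pass `open(path, "w", newline="")` as the handle.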

chowbina commented 6 years ago

Hi Justin,

Thank you for the reply. This format is working and the command is not providing any error.

However, I do not see metrics specific to the exon regions that I provided.

Here is the command that I used:

Rscript runIchorCNA.R --id tumor_sample --WIG $TUMOR_WIG --NORMWIG $NORMAL_WIG --ploidy "c(2,3)" --normal "c(0.5,0.6,0.7,0.8,0.9)" --maxCN 10 --gcWig $GC_WIG --mapWig $MAP_WIG --centromere $CENT_WIG --includeHOMD False --chrs "c(1:22)" --chrTrain "c(1:22)" --estimateNormal False --estimatePloidy False --estimateScPrevalence False --scStates "c(1,3)" --outDir $OUTDIR --exons.bed $EXONS_BED

tumor_sample.correctedDepth.txt still reports bins at intervals of 999999 (i.e., 1 Mb bins):

chr    start      end         log2_TNratio_corrected
1      2000001    3000000     NA
1      8000001    9000000     0.154257665720098
1      9000001    10000000    -0.0836845239367425

Thank you, Sudhir

gavinha commented 6 years ago

Hi Sudhir,

The exon/targeted interval file is used to select (via overlap) the bins from the WIG files. The normalization approach used by ichorCNA requires equal-sized, non-overlapping bins; therefore, it will not return results for the exon intervals themselves.
You can try smaller bin sizes (e.g. 25kb or 50kb) to increase resolution around the exons.
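
To illustrate what "select bins via overlap" means, here is a small Python sketch. It is not ichorCNA's code (ichorCNA is written in R); the function names and the 1-based, closed-interval convention for both bins and exons are assumptions made for this example. It shows that the output remains whole fixed-size bins, which is why the results are not reported per exon.

```python
def make_bins(chrom, chrom_length, bin_size):
    """Tile a chromosome into non-overlapping, 1-based closed bins.

    All bins have the same size except possibly the last one, which is
    truncated at the chromosome end.
    """
    return [
        (chrom, start, min(start + bin_size - 1, chrom_length))
        for start in range(1, chrom_length + 1, bin_size)
    ]


def overlaps(bin_, interval):
    """True if a bin and an interval share a chromosome and at least one base."""
    return (
        bin_[0] == interval[0]
        and bin_[1] <= interval[2]
        and interval[1] <= bin_[2]
    )


def select_bins(bins, intervals):
    """Keep every bin that overlaps at least one interval; bins are not resized."""
    return [b for b in bins if any(overlaps(b, iv) for iv in intervals)]


# Hypothetical 5 Mb chromosome with 1 Mb bins and two made-up exon intervals.
bins = make_bins("1", 5_000_000, 1_000_000)
exons = [("1", 2_500_100, 2_500_400), ("1", 3_999_900, 4_000_100)]
kept = select_bins(bins, exons)
for chrom, start, end in kept:
    print(chrom, start, end)
```

The second exon straddles a bin boundary, so both neighbouring bins are kept. With smaller bins, fewer non-exonic bases are carried along with each selected bin.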

Best, Gavin

chowbina commented 5 years ago

Hello, I ran readCounter and ichorCNA with 500kb intervals, used the 500kb GC and map files, and tried BED files with window sizes of 50kb and 500kb, but could not get it to work. I get the following error:

Correcting for GC bias...
Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  span is too small
Calls: loadReadCountsFromWig -> correctReadCounts -> loess -> simpleLoess
In addition: Warning message:
In .replace_seqlevels_style(x_seqlevels, value) :
  found more than one best sequence renaming map compatible with seqname style "NCBI" for this object, using the first one
Execution halted

ilykos commented 9 months ago

What is the main idea behind the --exons.bed parameter? Is it used to narrow the analysis to exons in WES data? Or can it be used with WGS too?