broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

Missing bins on chr19 for hg38 but not hg19 #124

Open zztin opened 1 year ago

zztin commented 1 year ago

Hi,

I am running ichorCNA (commit 5bfc03e) on a BAM file mapped to hg38 (GRCh38Decoy), using the attached config file for the snakemake pipeline. For the GC and mappability files, I used the ones in the inst/extdata/ folder. However, the ichorCNA output contains far fewer bins than it should. When I run it with hg19 settings (BAM file mapped to hs37d5), the problem does not occur.

You can see from the SAMPLE.correctedDepth.txt and SAMPLE_correct.pdf files that only 2 bins are left on chr19.
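For anyone who wants to compare the two runs numerically, here is a minimal Python sketch that counts bins per chromosome in a correctedDepth-style table. It assumes the file is tab-separated with a header row, a chromosome column named `chr`, and a value column named `log2_TNratio_corrected`. Those column names, the helper `count_bins_per_chrom`, and the in-memory demo rows are assumptions for illustration, not part of ichorCNA or of the attached results.

```python
import csv
import io
from collections import Counter


def count_bins_per_chrom(handle, chrom_col="chr", value_col=None):
    """Count rows (bins) per chromosome in a tab-separated table.

    If value_col is given, rows whose value is empty, NA or NaN are skipped,
    so only bins that survived correction are counted.
    Column names are assumptions; adjust them to the real header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        if value_col is not None and row[value_col] in ("", "NA", "NaN"):
            continue
        counts[row[chrom_col]] += 1
    return counts


# Made-up demo rows standing in for a real SAMPLE.correctedDepth.txt.
demo_text = (
    "chr\tstart\tend\tlog2_TNratio_corrected\n"
    "chr18\t1\t1000000\t0.01\n"
    "chr18\t1000001\t2000000\tNA\n"
    "chr19\t1\t1000000\t-0.02\n"
)

all_bins = count_bins_per_chrom(io.StringIO(demo_text))
usable_bins = count_bins_per_chrom(
    io.StringIO(demo_text), value_col="log2_TNratio_corrected"
)

for chrom in all_bins:
    print(chrom, all_bins[chrom], usable_bins[chrom])
```

To use it on real output, replace the `io.StringIO(...)` argument with `open("SAMPLE.correctedDepth.txt", newline="")` and run it once for the hg19 result and once for the hg38 result.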

The config files are adapted from the snakemake configs in the GitHub repository: I started from the config file for hg19 and updated the relevant file paths to the provided hg38 versions. The provided config_hg38.yaml gives errors and seems to be missing quite a few parameters, so I used this customized version migrated from config.yaml instead.

HC_01_hg19.zip HC_01_hg38.zip config-ichorcna-hg19.yaml.txt config-ichorcna-hg38.yaml.txt

Possibly related: in the Snakefile, the first two lines are

configfile: "config.yaml"
configfile: "sample.yaml"

I changed them to point to the respective config file for each genome version (rather than appending to the original). I expect this to lead to cleaner, more predictable behavior.

Last but not least, thank you for developing the tool!

Best, Li-Ting