Closed by mazzalab 9 months ago
Yes, the VCF file GATK creates is not sufficient. You can use the GenomicsDB, though, if you installed the genomicsdb R package (see our Dockerfile).
$ Rscript $PURECN/NormalDB.R --out-dir $OUT_REF \
--coverage-files example_normal_coverages.list \
--normal-panel $GENOMICSDB-WORKSPACE-PATH/pon_db \
--genome hg19 \
--assay agilent_v6
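A side note on the `--normal-panel` path above: `$GENOMICSDB-WORKSPACE-PATH` reads as a placeholder, not a literal variable, because a shell variable name cannot contain hyphens. The sketch below shows what the shell would do with it if pasted as-is. The variable `GENOMICSDB_WORKSPACE_PATH` and the `/data` paths are made up for illustration and are not part of PureCN or GATK.

```shell
# Hypothetical locations; only the variable names matter here.
GENOMICSDB=/data
GENOMICSDB_WORKSPACE_PATH=/data/genomicsdb

# A hyphen ends the variable name, so the shell expands $GENOMICSDB
# and appends the literal text "-WORKSPACE-PATH/pon_db".
literal_form="$GENOMICSDB-WORKSPACE-PATH/pon_db"

# An underscore-only name, wrapped in braces, expands as intended.
braced_form="${GENOMICSDB_WORKSPACE_PATH}/pon_db"

echo "$literal_form"   # prints /data-WORKSPACE-PATH/pon_db
echo "$braced_form"    # prints /data/genomicsdb/pon_db
```

In practice, substituting the real directory (or a variable with a valid name) for the placeholder avoids the surprise.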
Thanks Markus. What I was saying is that I actually used your Docker image from Docker Hub, where I guess GenomicsDB is already installed and working. If that is the case, the only difference between my command line and yours is this:
--coverage-files example_normal_coverages.list \
Could this be the reason for the error?
If you prefer, I can describe line by line what I've done.
The output of GATK CreateSomaticPanelOfNormals is not sufficient, so NormalDB.R does its own thing, similar to it. So don't provide that VCF output; provide the actual GenomicsDB directory instead, i.e. the output of GATK GenomicsDBImport.
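To make the distinction concrete, here is a minimal pre-flight sketch that could run before NormalDB.R in a pipeline. It only checks whether the value about to be passed to `--normal-panel` is a directory (as a GenomicsDB workspace written by GenomicsDBImport would be) or a VCF file (as CreateSomaticPanelOfNormals produces). The helper `classify_normal_panel` and the demo paths are hypothetical; the check does not validate the contents of the workspace.

```shell
# Hypothetical helper: report what kind of path is being passed to --normal-panel.
classify_normal_panel() {
    panel=$1
    if [ -d "$panel" ]; then
        echo "directory"          # candidate GenomicsDB workspace
    elif [ -f "$panel" ]; then
        case "$panel" in
            *.vcf|*.vcf.gz) echo "vcf" ;;   # e.g. CreateSomaticPanelOfNormals output
            *)              echo "file" ;;
        esac
    else
        echo "missing"
    fi
}

# Demo with throwaway paths (names are made up for illustration).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/pon_db"
: > "$demo_dir/pon.vcf.gz"

classify_normal_panel "$demo_dir/pon_db"       # prints "directory"
classify_normal_panel "$demo_dir/pon.vcf.gz"   # prints "vcf"
```

A pipeline step could stop early with a readable message when the result is "vcf", rather than waiting for NormalDB.R to fail.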
It works, thanks. I suggest making this clearer in the best practices document.
Following the GATK4 guidelines here https://gatk.broadinstitute.org/hc/en-us/articles/360035531132--How-to-Call-somatic-mutations-using-GATK4-Mutect2#:~:text=The%20three%20steps%20to%20create%20a%20panel%20of%20normals%20are%3A I've made my PON.
I've then tried to import it into PureCN with the command:
where $NORMAL_PANEL is my "pon.vcf.gz" file made above, but I get the following error:
FATAL [2023-12-10 22:35:04] The normal.panel.vcf.file contains only a single sample.
This happens even though the PON contains two samples, not one.
Note 1: I'm running PureCN in a Nextflow pipeline.
Note 2: I was not able to install PureCN on my custom Docker machine. Hence, I'm using this image (the latest):
https://hub.docker.com/r/markusriester/purecn
where GenomicsDB-R is supposed to be installed and working properly. Can you suggest how to get the function above to work?
Complete Log Command error: