Closed h170607 closed 1 year ago
That's weird. What's the output of Rscript /data/envsoft/R-4.0.0/lib64/R/library/PureCN/extdata/PureCN.R --help ?
There seems to be some problem:
/data/envsoft/R-4.0.0/bin/Rscript /data/envsoft/R-4.0.0/lib64/R/library/PureCN/extdata/PureCN.R --help
`Possible Ensembl SSL connectivity problems detected. Please see the 'Connection Troubleshooting' section of the biomaRt vignette vignette('accessing_ensembl', package = 'biomaRt')Error in curl::curl_fetch_memory(url, handle = handle) : Peer certificate cannot be authenticated with given CA certificates: [uswest.ensembl.org] Peer's Certificate has expired.
Usage: /data/envsoft/R-4.0.0/lib64/R/library/PureCN/extdata/PureCN.R [options] Options: -i SAMPLEID, --sampleid=SAMPLEID Sample id --normal=NORMAL Input: normal coverage, GC-normalized. Optional if normaldb or segfile is provided. --tumor=TUMOR Input: tumor coverage, GC-normalized --vcf=VCF Input: VCF file --rds=RDS Input: PureCN output RDS file, used to regenerate plots and files after manual curation --mappingbiasfile=MAPPINGBIASFILE Input: Mapping bias RDS file, generated by NormalDB.R (or calculateMappingBiasVcf) --normal_panel=NORMAL_PANEL Input: Deprecated, use --mappingbiasfile instead...`
need to reinstall this r package?
Yes, this looks like an ancient version. Try following the README here to get the most recent version.
I upgraded the software to the latest version 2.7.11, using BiocManager::install("lima1/PureCN")
,
and still have the following tips:
Possible Ensembl SSL connectivity problems detected.
Please see the 'Connection Troubleshooting' section of the biomaRt vignette
vignette('accessing_ensembl', package = 'biomaRt')Error in curl::curl_fetch_memory(url, handle = handle) :
Peer certificate cannot be authenticated with given CA certificates: [uswest.ensembl.org] Peer's Certificate has expired.`
I tried running it and still got errors:
WARN [2023-09-15 14:08:22] vcf.file has no DB info field for membership in germline databases. Found and used somatic status instead.
INFO [2023-09-15 14:08:22] 0 (0.0%) variants annotated as likely germline (DB INFO flag).
FATAL [2023-09-15 14:08:22] VCF either contains no germline variants or variants are not properly annotated.
My vcf file was generated by analyzing normal.bam and tumor.bam files with mutect2 and was filtered by the FilterMutectCalls. I don't quite understand why germline mutations are needed. Does this mean using the result of the HaplotypeCaller pipeline?
I am currently looking into this as well. Maybe you ran Mutect2 without --germline-resource, e.g. gnomAD? Mutect2 converts AF from germline resource to POPAF=-log10(AF), which PureCN can use for better fitting of copy numbers and estimates of purity/ploidy. PureCN can be run without --vcf if you want to do it without germline mutations, but I expect a loss of performance.
@h170607, have a look here, you will need to tell Mutect to get germline sites. https://bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.html#3_Create_VCF_files
SNPs provide the allele-specific copy number signal, have a look at our or the ABSOLUTE paper.
We reran Mutect2 with --genotype-germline-sites true, --genotype-pon-sites true and --interval-padding 75, and got the result. It doesn't seem to need gnomAD.
Yes, if you have a matched normal, it uses that for the germline vs somatic priors.
If you get lots of variants removed by the BQ < 25 filter, see #320.
Hi, I Run PureCN with CNVkit segmentation:
cnvkit.py export seg ${SAMPLE}_cnvkit.cns --enumerate-chroms -o ${SAMPLE}_cnvkit.seg
`Rscript /data/envsoft/R-4.0.0/lib64/R/library/PureCN/extdata/PureCN.R --out ./${SAMPLE} \
The VCF contains matched normal information. Then got the error, with no results:
/data/envsoft/R-4.0.0/lib64/R/library/PureCN/extdata/PureCN.R: error: Error in getopt(spec = spec, opt = args) : long flag "seg-file" is invalid