Open meredith705 opened 2 years ago
Using fasta file from alternative sources will help you get rid of this.
I am also getting this error. But what is it about the fasta file thats causing this error? Any more information that can help debug this?
Thanks.
Did you figure this out @meredith705? Looks like we are doing something very similar.
@KatharineME @YuanfengZhang I have tried using a different version of GRCh38 and have still gotten the same error.
The other version is https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz
@KatharineME @YuanfengZhang I have tried using a different version of GRCh38 and have still gotten the same error.
The other version is https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz
(1) Try GDC/BWA/GATK/GA4GH/other sources. (2) Someone knowing cpp is need to handle this.
Hi All, I got hap.py working without changing the reference genome. I did the following. I'm assuming that you are using rtg as the vcf comparison engine:
1) Create genome sdf with rtg
rtg format -o path_to_sdf path_to_decompressed_reference_genome`
2) Get the pkrusche/hap.py docker container with latest
tag
3) Run the hap.py docker container like so
docker run --interactive --detach --tty --user root -v local_dir_with_all_needed_files:dir_in_container pkrusche/hap.py bash
4) Calling hap.py like so. Notice that the reference genome passed is decompressed. Remember that all paths in the command are paths in the container.
docker exec --interactive docker_container_id bash -c "/opt/hap.py/bin/hap.py path_to_truth_vcf path_to_query_vcf -f path_to_confident_regions_bed -r path_to_decompressed_reference_genome -o output_dir --engine-vcfeval-path path_to_rtg_folder --engine-vcfeval-template path_to_genome_sdf"
If this doesn't work for you, I recommend trying to run the examples from the readme in the docker container.
Hello,
I'm running the hap.py v0.3.12 docker using the Genome In A Bottle HG002 vs GRCh38 truth set using the files below and getting a segmentation fault error in the gvcf2bed program.
The GIAB bottle truth set is here https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38/
The vcf I'm using as truth is:
HG002_GRCh38_1_22_v4.2.1_benchmark_phased_MHCassembly_StrandSeqANDTrio.vcf
from their supplementary files hereHigh confidence bed file:
HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.bed
file from the same directory as aboveThe GRCh38 reference is: https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set_maskedGRC_exclusions_v2.fasta.gz
The query vcf I'm using is created by running dipcall on two haploid assemblies of chr11 ( a small test assembly ) using the same GRCh38 assembly as a reference:
paternal.dip.vcf.gz
Here is my command:
I get a segmentation fault error after about 2 minutes of run time.
I really appreciate your help figuring out this issue, please let me know what other information I can provide.
Thank you in advance, Melissa