yue131 opened 3 weeks ago
I think you might have a broken version of PAV, can you pull v2.4.1.2
and run from there? I corrected some VCF problems in the latest version.
I am running the latest version just now (Oct-16-2024), and I hit the same issue with:

```
apptainer run --bind ${PWD}:${PWD} library://becklab/pav/pav:latest -c 2
```
```
RuleException:
RuntimeError in file /opt/pav/rules/vcf.snakefile, line 87:
Unknown variant type: "SV" at index chr1-26639828-INV-8356
  File "/opt/pav/rules/vcf.snakefile", line 87, in __rule_vcf_write_vcf
  File "/opt/pav/pavlib/vcf.py", line 198, in write_merged_vcf
  File "/opt/pav/dep/svpop/svpoplib/vcf.py", line 272, in ref_base
[Wed Oct 16 11:29:34 2024]
Error in rule vcf_write_vcf:
    jobid: 0
    input: results/NA18500/bed_merged/pass/snv_snv.bed.gz, results/NA18500/bed_merged/pass/svindel_ins.bed.gz, results/NA18500/bed_merged/pass/svindel_del.bed.gz, results/NA18500/bed_merged/pass/sv_inv.bed.gz, results/NA18500/bed_merged/fail/snv_snv.bed.gz, results/NA18500/bed_merged/fail/svindel_ins.bed.gz, results/NA18500/bed_merged/fail/svindel_del.bed.gz, results/NA18500/bed_merged/fail/sv_inv.bed.gz, results/NA18500/bed_merged/pass/fa/svindel_ins.fa.gz, results/NA18500/bed_merged/pass/fa/svindel_del.fa.gz, results/NA18500/bed_merged/fail/fa/svindel_ins.fa.gz, results/NA18500/bed_merged/fail/fa/svindel_del.fa.gz, data/ref/contig_info.tsv.gz
    output: NA18500.vcf.gz
Exiting because a job execution failed. Look above for error message
WorkflowError: At least one job did not complete successfully.
[Wed Oct 16 11:29:35 2024]
Error in rule vcf_write_vcf:
    jobid: 1
    input: results/NA18500/bed_merged/pass/snv_snv.bed.gz, results/NA18500/bed_merged/pass/svindel_ins.bed.gz, results/NA18500/bed_merged/pass/svindel_del.bed.gz, results/NA18500/bed_merged/pass/sv_inv.bed.gz, results/NA18500/bed_merged/fail/snv_snv.bed.gz, results/NA18500/bed_merged/fail/svindel_ins.bed.gz, results/NA18500/bed_merged/fail/svindel_del.bed.gz, results/NA18500/bed_merged/fail/sv_inv.bed.gz, results/NA18500/bed_merged/pass/fa/svindel_ins.fa.gz, results/NA18500/bed_merged/pass/fa/svindel_del.fa.gz, results/NA18500/bed_merged/fail/fa/svindel_ins.fa.gz, results/NA18500/bed_merged/fail/fa/svindel_del.fa.gz, data/ref/contig_info.tsv.gz
    output: NA18500.vcf.gz
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-15T154641.291416.snakemake.log
WorkflowError: At least one job did not complete successfully.
```
I also tried v2.4.1.2 and got exactly the same error: `Unknown variant type: "SV" at index chr1-26639828-INV-8356`
Thanks. I'm running a test on 2.4.3 today.
I was able to replicate and fix the issue. I have just released 2.4.4.
I was facing the exact same issue. Let me try with the new version. Thanks.
@paudano Simply using library://becklab/pav/pav:latest via Singularity should work now? I see that the Dockerfile is also updated with the new version?
I tried running:

```
singularity run --bind "$(pwd):$(pwd)" library://becklab/pav/pav:latest -c 32
```

I am getting this new error now:

```
Building DAG of jobs...
MissingInputException in rule call_callable_regions in file /opt/pav/rules/call.snakefile, line 189:
Missing input files for rule call_callable_regions:
    output: results/GM24385-1-second/callable/callable_regions_h1_500.bed.gz
    wildcards: asm_name=GM24385-1-second, hap=h1, flank=500
    affected files:
        results/GM24385-1-second/align/trim-tigref/aligned_query_h1.bed.gz
        temp/GM24385-1-second/lgsv/sv_ins_h1.bed.gz
        temp/GM24385-1-second/lgsv/sv_del_h1.bed.gz
```
It's an issue on the container build machine; there's something I should have cleared and did not. I'm rebuilding it now, and I'll try running the built container before releasing.
Thanks @paudano! Can you please let me know here when you are done with it?
Test is running now. The error is gone, but I want to make sure there are no more before releasing again.
Version 2.4.5 is up. I am simultaneously managing two versions of PAV while I'm making major improvements (not yet released), and it's breaking some of the processes I have set up to manage things. I'll be more careful about this. Thanks for your patience!
Thanks @paudano for working on this. I see that the error is gone now, and I have been able to run it successfully to the end using the latest version.
@paudano Just checking: is there a change in the VCF format with the latest version? I just encountered an error when running Truvari bench on the VCF produced by the latest version. This never happened with the VCFs produced by previous PAV versions. Any insights on this would be helpful!
Here is the error:

```
[W::bcf_hdr_register_hrec] The type "STR" is not supported, assuming "String"
[W::bcf_hdr_register_hrec] The type "STR" is not supported, assuming "String"
[W::vcf_parse_filter] FILTER 'COMPOUND,TRIM' is not defined in the header
[E::bcf_hdr_parse_line] Could not parse the header line: "##FILTER=<ID=COMPOUND,TRIM,Description=\"Dummy\">"
[E::vcf_parse_filter] Could not add dummy header for FILTER 'COMPOUND,TRIM' at chr1:16725033
Traceback (most recent call last):
  File "/projects/rsaju_prj/miniconda3/envs/truvari/bin/truvari", line 10, in <module>
```
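For context on why htslib rejects this: the VCF specification separates multiple entries in the FILTER column with `;`, not `,` (so the value should be `COMPOUND;TRIM`, with a separate `##FILTER=<ID=...>` header line for each ID), and `STR` is not a valid header Type (the spec allows Integer, Float, Flag, Character, String). A minimal illustration of the comma problem, not PAV code:

```python
# Minimal sketch (not PAV code): flag FILTER values containing a comma,
# which htslib rejects because the VCF spec separates filters with ';'.
def bad_filter_values(vcf_lines):
    bad = []
    for line in vcf_lines:
        if line.startswith('#'):  # skip header lines
            continue
        fields = line.rstrip('\n').split('\t')
        if len(fields) > 6 and ',' in fields[6]:  # FILTER is column 7
            bad.append((fields[0], fields[1], fields[6]))
    return bad

records = [
    '##fileformat=VCFv4.2',
    'chr1\t16725033\t.\tA\t<INV>\t.\tCOMPOUND,TRIM\tSVTYPE=INV',  # invalid: comma
    'chr1\t26639828\t.\tA\t<INV>\t.\tCOMPOUND;TRIM\tSVTYPE=INV',  # valid: semicolon
]
print(bad_filter_values(records))  # → [('chr1', '16725033', 'COMPOUND,TRIM')]
```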
Yes, the VCF headers and FILTER column need to be fixed. I'll release a fix for this, and you can have PAV regenerate the VCFs with that fix (it won't need to re-run the whole sample). I am traveling today, but I'll get this corrected as soon as I can.
Pushed corrected version 2.4.6.
With the new version, run PAV and add this to the end of the command:

```
-R vcf_write_vcf
```

That will re-create the VCFs without re-running the whole pipeline. The bad VCFs will be overwritten.
I usually test VCFs by running them through `bcftools view`, which I forgot to do with these changes. It is not generating any errors on my test set now.
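That sanity check can be as simple as the following, assuming `bcftools` is on your PATH and using the output name from this thread:

```
# Stream the whole file through htslib's parser; header or FILTER problems
# surface as [W::...] warnings or [E::...] errors like the ones reported above.
bcftools view NA18500.vcf.gz > /dev/null
```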
Please let me know if you encounter any other issues.
Perfect and thanks @paudano
I just ran:

```
singularity run --bind "$(pwd):$(pwd)" library://becklab/pav/pav:latest -c 32 -R vcf_write_vcf
```

But it seems like it's running from the start? The directory already contains the output files.
```
INFO:    Downloading library image
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 32
Rules claiming more threads will be scaled down.
Job stats:
job                                count
align_depth_bed                        2
align_get_read_bed                     2
align_get_tig_fa                       2
align_map                              2
align_trim_tig                         2
align_trim_tigref                      2
call_callable_regions                  2
call_cigar                            20
call_cigar_merge                       2
call_integrate_fail_fa                 6
call_integrate_filter_redundant        8
call_integrate_sources                 2
call_intersect_fail                    8
call_intersect_fail_batch            160
call_inv_batch                       120
call_inv_batch_merge                   2
call_inv_cluster                       4
call_inv_flag_insdel_cluster           4
call_inv_merge_flagged_loci            2
call_lg_discover                      20
call_lg_split                          2
call_merge_haplotypes                  6
call_merge_haplotypes_batch          160
call_merge_haplotypes_snv              2
call_merge_lg                          6
pav_all                                1
vcf_write_vcf                          1
total                                550
Select jobs to execute...
Execute 2 jobs...
```
There might be a step pulling something from temp files, which might need to be re-created. I've already addressed this in the next release so whole pipelines won't need to be re-run. You can add `--nt` to the command line to keep temp files until this is resolved, then `rm -r temp` after it's done. For now, however, you might have to re-run the pipeline.
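A sketch of that workaround, reusing the command from earlier in this thread (the bind path and core count are the user's; `--nt` is Snakemake's shorthand for `--notemp`):

```
# Keep temp files (--nt) so a later targeted re-run can reuse them
singularity run --bind "$(pwd):$(pwd)" library://becklab/pav/pav:latest -c 32 --nt
# Once the run completes and the output VCFs are verified:
rm -r temp
```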
Hmm, I actually had the temp folder in the directory while running, but it still could not pick it up. Anyway, I have re-run it now. Thanks.
Hello, I ran into some problems using native PAV to call SVs from haplotype assemblies, and after many tries I still couldn't get it to work. If you can give me some suggestions, I'd appreciate it! The command is listed below:

```
snakemake -s /share/path/Snakefile YCM.vcf.gz --cores 25
```

Part of run.log is as follows. The beginning:
```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 25
Rules claiming more threads will be scaled down.
Job stats:
job                                count    min threads    max threads
align_depth_bed                        2              1              1
align_get_read_bed                     2              1              1
align_get_tig_fa                       2              1              1
align_map                              2             24             24
align_trim_tig                         2              1              1
align_trim_tigref                      2              1              1
call_callable_regions                  2              1              1
call_cigar                            20              1              1
call_cigar_merge                       2              1              1
call_integrate_fail_fa                 6              1              1
call_integrate_filter_redundant        8              1              1
call_integrate_sources                 2              1              1
call_intersect_fail                    8              1              1
call_intersect_fail_batch            160              1              1
call_inv_batch                       120              4              4
call_inv_batch_merge                   2              1              1
call_inv_cluster                       4              1              1
call_inv_flag_insdel_cluster           4              1              1
call_inv_merge_flagged_loci            2              1              1
call_lg_discover                      20             12             12
call_lg_split                          2              1              1
call_merge_batch_table                 1              1              1
call_merge_haplotypes                  6              1              1
call_merge_haplotypes_batch          160             12             12
call_merge_haplotypes_snv              2              1              1
call_merge_lg                          6              1              1
data_align_ref                         1              1              1
data_align_ref_anno_n_gap              1              1              1
data_ref_contig_table                  1              1              1
vcf_write_vcf                          1              1              1
total                                553              1             24
Select jobs to execute...
```
The error:

```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 25
Rules claiming more threads will be scaled down.
Provided resources: mem_mb=24576, mem_mib=23438, disk_mb=1000, disk_mib=954
Select jobs to execute...
[Tue Oct 1 13:28:01 2024]
Error in rule vcf_write_vcf:
    jobid: 0
    input: results/YCM/bed_merged/pass/snv_snv.bed.gz, results/YCM/bed_merged/pass/svindel_ins.bed.gz, results/YCM/bed_merged/pass/svindel_del.bed.gz, results/YCM/bed_merged/pass/sv_inv.bed.gz, results/YCM/bed_merged/fail/snv_snv.bed.gz, results/YCM/bed_merged/fail/svindel_ins.bed.gz, results/YCM/bed_merged/fail/svindel_del.bed.gz, results/YCM/bed_merged/fail/sv_inv.bed.gz, results/YCM/bed_merged/pass/fa/svindel_ins.fa.gz, results/YCM/bed_merged/pass/fa/svindel_del.fa.gz, results/YCM/bed_merged/fail/fa/svindel_ins.fa.gz, results/YCM/bed_merged/fail/fa/svindel_del.fa.gz, data/ref/contig_info.tsv.gz
    output: YCM.vcf.gz
```

And the end of the log:

```
RuleException:
RuntimeError in file /share/home/yuejw/software/PAV/pav/rules/vcf.snakefile, line 92:
Unknown variant type: "SV" at index chr1-16394892-INV-161108
  File "/share/home/yuejw/software/PAV/pav/rules/vcf.snakefile", line 92, in __rule_vcf_write_vcf
  File "/share/home/yuejw/software/PAV/pav/pavlib/vcf.py", line 198, in write_merged_vcf
  File "/share/home/yuejw/software/PAV/pav/dep/svpop/svpoplib/vcf.py", line 272, in ref_base
  File "/share/home/yuejw/miniconda3/envs/HiFi/lib/python3.7/concurrent/futures/thread.py", line 57, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Exiting because a job execution failed. Look above for error message
```
Thanks!