Open mzwaig opened 1 year ago
Thanks for sharing this error. Unfortunately, the previous web links have expired. Please download the reference and uniqueness files from Zenodo: https://doi.org/10.5281/zenodo.7689958 I will update the GitHub README accordingly.
- Maizie (Xin) Zhou, Ph.D. Assistant Professor Biomedical Engineering, Computer Science, and Data Science Institute Vanderbilt University
5919 Stevenson Center
Nashville, TN 37235
Phone: 615-343-6843
https://lab.vanderbilt.edu/maizie-zhou-lab/
From: mzwaig @.>
Sent: Wednesday, March 1, 2023 7:58 AM
To: maiziex/Aquila @.>
Cc: Subscribed @.***>
Subject: [maiziex/Aquila] Can't untar reference files (Issue #5)
Hi,
I'm trying to download the reference and uniqueness files to run Aquila but the files seem to be tar'ed HTML files which I can't unzip
Best, Melissa
(/lb/project/tools/conda/Aquila) Wed Mar 01 08:51:26 /lb/project/tools/Aquila $ tar xvf source.tar.gz
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
(/lb/project/tools/conda/Aquila) Wed Mar 01 08:51:32 /lb/project/tools/Aquila $ file source.tar.gz
source.tar.gz: HTML document, ASCII text, with very long lines, with no line terminators
(/lb/project/tools/conda/Aquila) Wed Mar 01 08:51:38 /lb/project/tools/Aquila $ file Uniqness_map.tar.gz
Uniqness_map.tar.gz: HTML document, ASCII text, with very long lines, with no line terminators
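The `file` output above shows that the download returned an HTML page saved under a `.tar.gz` name. A minimal sketch of the same check in Python, useful before calling `tar` in a script, might look like the following. The function name `sniff_download` is hypothetical and not part of Aquila; it only inspects the first bytes of the file (gzip streams start with the two bytes 0x1f 0x8b).

```python
import sys


def sniff_download(path):
    """Classify a downloaded file as 'gzip', 'html', or 'unknown'.

    Hypothetical helper, not part of Aquila: it only looks at the first
    bytes of the file, much like `file` does.
    """
    with open(path, "rb") as fh:
        head = fh.read(512)
    # Every gzip stream starts with the magic bytes 0x1f 0x8b.
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    # An expired link often returns an HTML error or landing page instead.
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html"
    return "unknown"


if __name__ == "__main__":
    # Usage: python sniff_download.py source.tar.gz Uniqness_map.tar.gz
    for path in sys.argv[1:]:
        print(path, sniff_download(path))
```

A result of "html" means the link served a web page rather than the archive, so the file needs to be downloaded again from a working source.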
Thank you!
I'm having a second issue where I get this message when I try running the first step. My SNP calling is done with GATK, not FreeBayes (still through LongRanger). Could that be causing the issue?
Traceback (most recent call last):
  File "/lb/project/tools/Aquila/bin/Run_h5_all_multithreads.py", line 62, in Cal_snp_ratio_vs_depth
    AO_idx = _format.index("AO")
ValueError: 'AO' is not in list
Yes, sorry for the inconvenience. Aquila currently only accepts VCF files from FreeBayes.
Could I solve that by modifying the VCF parsing in Run_h5_all_multithreads.py, or is the VCF used in other steps as well?
Yes, you can just modify the VCF parsing in Run_h5_all_multithreads.py to handle the GATK VCF format; no other steps need to change. For a model, search the README for "We now have a new version for step1 to use 1000 Genomes VCF as the input VCF file (please check here), and Aquila will use common variants from 1000G to help partition linked-reads. In the later version, Aquila will use Graph Genome Reference to replace Conventional Linear Reference." For that version I added a Python script, "Run_h5_all_multithreads_GenRef.py", in the bin folder to parse the 1000 Genomes VCF file. You can do the same thing for a GATK VCF.
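As a rough illustration of the kind of change involved: FreeBayes writes per-sample `RO` and `AO` keys (reference and alternate observation counts) in the FORMAT column, while GATK typically reports allele depths in a single `AD` key (reference count first, then one count per alternate allele). The sketch below is not taken from Aquila. The function name `allele_counts` is hypothetical, and how Run_h5_all_multithreads.py uses the counts afterwards is not shown in this thread. It only demonstrates reading either layout from one VCF record.

```python
def allele_counts(format_field, sample_field):
    """Return (ref_count, alt_count) for one VCF sample column.

    Hypothetical helper, not Aquila's code. Handles FreeBayes-style
    RO/AO keys and falls back to the AD key that GATK writes.
    """
    fields = dict(zip(format_field.split(":"), sample_field.split(":")))

    def to_int(value):
        # Treating a missing value (".") as zero is an assumption made
        # for this sketch only.
        return 0 if value == "." else int(value)

    if "AO" in fields and "RO" in fields:
        # FreeBayes: RO is the reference count, AO lists alternate counts.
        ref = to_int(fields["RO"])
        alt = sum(to_int(v) for v in fields["AO"].split(","))
    elif "AD" in fields:
        # GATK: AD is "ref,alt1[,alt2...]".
        depths = [to_int(v) for v in fields["AD"].split(",")]
        ref, alt = depths[0], sum(depths[1:])
    else:
        raise ValueError("no AO/RO or AD key in FORMAT: " + format_field)
    return ref, alt


# Example with a GATK-style record (columns 9 and 10 of a VCF line):
print(allele_counts("GT:AD:DP:GQ:PL", "0/1:12,9:21:99:255,0,300"))
```

Looking keys up by name rather than calling `_format.index("AO")` directly avoids the `ValueError` shown above when a caller other than FreeBayes produced the file.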
Great. Thanks!
Hi,
I've modified step 1 to run with the GATK output but I'm getting another error which I've included below. This is the first error message I get so I'm not sure why the files in results_phased_probmodel aren't being generated.
Thanks, Melissa
Traceback (most recent call last):
  File "/lb/project/ioannisr/Melissa-abacus/tools/Aquila/bin/Cut_phaseblock_for_phased_h5_v4.0_highconf_v2.py", line 277, in <module>
    Cut_phaseblock_for_phased_h5(file_name,chr_num,out_file,block_len_use,block_threshold,output_dir,bed_file,phase_block_file,global_track,HC_breakpoint_file,"xin")
  File "/lb/project/ioannisr/Melissa-abacus/tools/Aquila/bin/Cut_phaseblock_for_phased_h5_v4.0_highconf_v2.py", line 99, in Cut_phaseblock_for_phased_h5
    f = open(h5_phased_file,"r")
FileNotFoundError: [Errno 2] No such file or directory: '/lb/project/ioannisr/NOBACKUP/Melissa-nobackup/Luigi-Gen3G/Aquila/1003C/results_phased_probmodel/chr1.phased_final'
[bam_sort_core] merging from 620 files and 20 in-memory blocks...
[E::idx_find_and_load] Could not retrieve index file for '/lb/project/ioannisr/NOBACKUP/Melissa-nobackup/Luigi-Gen3G/Aquila/1003C/sorted_bam/sorted_bam.bam'
python Aquila/bin/Aquila_step0_sortbam.py --bam_file possorted_bam.bam --out_dir Assembly_results_S12878 --num_threads_for_samtools_sort 30
Can you run step0 first?
Hi,
It generated a sorted_bam.bam file but no index and I was unable to index it with samtools as well.
Thanks, Melissa
That's weird. This step only uses "samtools sort" (https://github.com/maiziex/Aquila/blob/master/bin/Aquila_step0_sortbam.py). You may want to check your input BAM file and make sure it is not truncated.
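One quick way to test for truncation, sketched here as an assumption rather than anything Aquila does: a complete BAM file ends with a fixed 28-byte empty BGZF block (the EOF marker defined in the SAM/BAM specification), so a file that does not end with those bytes was most likely cut short. The function name `has_bgzf_eof` is hypothetical. `samtools quickcheck` performs a similar check from the command line.

```python
import os
import sys

# The 28-byte empty BGZF block that terminates every complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker.

    Hypothetical helper: a False result suggests the BAM was truncated,
    while True does not by itself prove the file is otherwise valid.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        # Read only the final 28 bytes of the file.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF


if __name__ == "__main__":
    # Usage: python check_bam_eof.py possorted_bam.bam sorted_bam.bam
    for path in sys.argv[1:]:
        print(path, "EOF marker present" if has_bgzf_eof(path) else "possibly truncated")
```

Running it on both the input BAM and the sorted output shows whether the problem was already present before step 0.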