B-UMMI / chewBBACA

BSR-Based Allele Calling Algorithm
GNU General Public License v3.0

AlleleCall error: updating hash tables #202

Closed: LudoPoire closed this issue 4 months ago

LudoPoire commented 4 months ago

Hi, I've been using your tool for quite a while (thank you so much for it!). In my latest run, on 322 bacterial genomes, I got an error for the first time; it was raised while the hash tables were being updated.

I created my schema from my own assemblies, then ran the following command:

chewBBACA.py AlleleCall -i 'path to assemblies' -o AlleleCall --cpu 14 -g 'path to schema dir' --cds

It ran fine until the very end, where I got the following error:

Wrapping up

Creating file with genome coordinates profiles (results_contigsInfo.tsv)...
Identifying paralogous loci and creating files with the list of paralogous loci (paralogous_counts.tsv & paralogous_loci.tsv)...
Identified 176 paralogous loci.
Assigning allele identifiers to inferred alleles...
Assigned identifiers to 250666 new alleles for 13586 loci.
Getting original sequence identifiers for new alleles...
Getting data for new representative alleles...
Adding the BLASTp self-score for the new representatives to Schema/short/self_scores
Creating FASTA files with the new alleles...
Adding new alleles to schema...
Updating allele size mode values stored in Schema/loci_modes
Updating pre-computed hash tables in Schema/pre_computed
Traceback (most recent call last):
  File "/home/horigene/anaconda3/envs/chewie/bin/chewBBACA.py", line 10, in <module>
    sys.exit(main())
  File "/home/horigene/anaconda3/envs/chewie/lib/python3.10/site-packages/CHEWBBACA/chewBBACA.py", line 1505, in main
    functions_info[process][1]()
  File "/home/horigene/anaconda3/envs/chewie/lib/python3.10/site-packages/CHEWBBACA/utils/process_datetime.py", line 146, in wrapper
    func(*args, **kwargs)
  File "/home/horigene/anaconda3/envs/chewie/lib/python3.10/site-packages/CHEWBBACA/chewBBACA.py", line 507, in run_allele_call
    allele_call.main(genome_list, loci_list, args.schema_directory,
  File "/home/horigene/anaconda3/envs/chewie/lib/python3.10/site-packages/CHEWBBACA/AlleleCall/allele_call.py", line 2938, in main
    total_hashes = update_hash_tables(updated_novel, loci_to_call,
  File "/home/horigene/anaconda3/envs/chewie/lib/python3.10/site-packages/CHEWBBACA/AlleleCall/allele_call.py", line 236, in update_hash_tables
    latest_prot_table = sorted(prot_tables,
IndexError: list index out of range

I was wondering if you could provide some help!

I can send you more info about the files if needed.

Have a nice day!

LudoPoire commented 4 months ago

Update: I just checked the folder containing the schema I created, and its pre_computed folder doesn't contain the PROTEINtable file. DNAtable1 is present, however. I don't know what should be done to fix this.
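For anyone who wants to run the same check before starting a long AlleleCall job, here is a minimal sketch. It assumes the hash tables live in a pre_computed subfolder of the schema directory and that their file names start with DNAtable and PROTEINtable, as the names mentioned in this thread suggest. The function name missing_table_prefixes is hypothetical and is not part of chewBBACA.

```python
from pathlib import Path

# Prefixes taken from the file names mentioned in this thread; the exact
# naming scheme used by chewBBACA is an assumption.
EXPECTED_PREFIXES = ("DNAtable", "PROTEINtable")


def missing_table_prefixes(schema_dir):
    """Return the expected prefixes that have no matching file in pre_computed."""
    pre_computed = Path(schema_dir) / "pre_computed"
    if not pre_computed.is_dir():
        # No folder at all: every expected table is missing.
        return list(EXPECTED_PREFIXES)
    names = [p.name for p in pre_computed.iterdir() if p.is_file()]
    return [
        prefix
        for prefix in EXPECTED_PREFIXES
        if not any(name.startswith(prefix) for name in names)
    ]
```

Calling missing_table_prefixes("path to schema dir") on a schema in the state described above would be expected to return a list containing only "PROTEINtable"; an empty list means both kinds of table were found.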

I re-ran everything with another Schema that I downloaded from Chewie-NS and it ran fine.

rfm-targa commented 4 months ago

Hello @LudoPoire,

Sorry for the delay. It's good to know that you've already found a solution. Based on the information you shared, there might have been an issue during the creation of the hash tables, so the PROTEINtable file was not created. If you delete the pre_computed folder (the one that holds the DNAtable and PROTEINtable files), chewBBACA should create it again, including the missing files. Another option is to run the PrepExternalSchema module to fix any issues within the schema and then retry the allele calling. Let us know if you run into this again or have any other issues.
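The first suggestion can be sketched in Python as follows. This is a cautious variant, not what the reply literally prescribes: instead of deleting pre_computed, it moves the folder next to the schema directory so that it can be restored if needed. The function name set_aside_pre_computed and the backup folder name are hypothetical, and the sketch relies on the statement above that chewBBACA recreates the folder when it is missing.

```python
from pathlib import Path


def set_aside_pre_computed(schema_dir):
    """Move <schema>/pre_computed beside the schema directory.

    Returns the backup path, or None if there was no pre_computed folder.
    """
    schema = Path(schema_dir).resolve()
    source = schema / "pre_computed"
    # Hypothetical backup location, kept outside the schema so that no
    # unexpected folder is left inside it.
    target = schema.parent / f"{schema.name}_pre_computed_backup"
    if not source.is_dir():
        return None
    if target.exists():
        raise FileExistsError(f"{target} already exists; remove or rename it first")
    source.rename(target)
    return target
```

After running it once on the affected schema, the next AlleleCall run should rebuild the tables; the backup folder can be deleted once the run completes without the IndexError.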

Kind regards,

Rafael