leosanbu / pyngoST

pyngoST: multiple sequence typing of Neisseria gonorrhoeae for large assembly collections
GNU General Public License v3.0

KeyError: '2.008|13.003-1' #1

Closed zahidul-islam-nahid closed 2 months ago

zahidul-islam-nahid commented 3 months ago

I ran 24 N. gonorrhoeae sequences and got the following error:

pyngoST.py -r fastafiles -s NG-STAR,MLST,NG-MAST -g -m -b -a -o output -t 16 -p /home____/pyngost/allelesDB/

pyngoST: multiple sequence typing of Neisseria gonorrhoeae for large assembly collections

Loading databases...

 Schemes requested: NG-STAR,MLST,NG-MAST,NG-MAST Genogroups

Number of processes: 16

 Finding matches...

Traceback (most recent call last):
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST.py", line 139, in <module>
    finalresults, ngmastClusters = access_results(results, genogroups)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST_utils.py", line 750, in access_results
    for f, all_results in results:
  File "/home/nahid/anaconda3/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/anaconda3/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/anaconda3/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/anaconda3/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/nahid/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST_utils.py", line 737, in process_files
    st_list['NG-STAR'] += '\t'+penAmosaicsdic[penA]
KeyError: '2.008|13.003-1'

zahidul-islam-nahid commented 3 months ago

The issue is resolved: I had to omit the -b option, which BLASTs new alleles to find the closest one.

zahidul-islam-nahid commented 3 months ago

I am also getting an error when trying to get the NG-MAST genogroups:

pyngoST.py -i contigs/------.fasta -s NG-STAR,MLST,NG-MAST -g -o output_------ -t 16 -p /____/tools/pyngost/allelesDB/

pyngoST: multiple sequence typing of Neisseria gonorrhoeae for large assembly collections

Loading databases...

 Schemes requested: NG-STAR,MLST,NG-MAST,NG-MAST Genogroups

Number of processes: 16

 Finding matches...

 Calculating NG-MAST genogroups...

Traceback (most recent call last):
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST.py", line 142, in <module>
    genopergenome = calculate_genogroups(out_path, PORout_results, TBPBout_results, ngmastClusters) if genogroups else None
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST_utils.py", line 567, in calculate_genogroups
    prepare_files_for_genogroups(out_path)
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST_utils.py", line 555, in prepare_files_for_genogroups
    align_sequences('POR_out.fas', 'POR_out.aln')
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/pyngoST/pyngoST_utils.py", line 533, in align_sequences
    stdout, stderr = cline()
                     ^^^^^^^
  File "/home/nahid/tools/pyngost/lib/python3.11/site-packages/Bio/Application/__init__.py", line 592, in __call__
    raise ApplicationError(return_code, str(self), stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 1 from 'muscle -in POR_out.fas -out POR_out.aln', message 'Invalid command line'

leosanbu commented 2 months ago

Hi, regarding your first comment: I have now updated pyngoST so that it does not crash when trying to report mosaic penA types for novel alleles with a structure such as '2.002-1'. It will see that 2.002 is a NonMosaic and report it as 'NonMosaic-like'.
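For readers following along, the fallback described above might look roughly like the sketch below. This is not the actual pyngoST code: the function name `describe_pena`, the one-entry table, and the `'NA'` placeholder are all assumptions made for illustration. It only shows the idea of looking up the base allele of a novel '<allele>-<n>' name and never raising a KeyError.

```python
import re

# Hypothetical lookup table (penA allele -> mosaic type); not pyngoST's real data.
PENA_TYPES = {"2.002": "NonMosaic"}


def describe_pena(allele, table=PENA_TYPES):
    """Return the mosaic type for a penA allele without raising KeyError."""
    if allele in table:
        return table[allele]
    # Assumption: novel alleles are named '<closest allele>-<n>', e.g. '2.002-1'.
    match = re.fullmatch(r"(.+)-\d+", allele)
    if match and match.group(1) in table:
        return table[match.group(1)] + "-like"
    # Anything else (e.g. '2.008|13.003-1') gets a placeholder instead of a crash.
    return "NA"
```

With this table, `describe_pena('2.002-1')` gives 'NonMosaic-like', and a key that cannot be resolved falls through to the placeholder.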

leosanbu commented 2 months ago

On your second comment, I am not sure what the problem is; it works for me with the test dataset. Do you have a working installation of 'muscle'?
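A quick way to check whether the external tools are on the PATH is sketched below. The helper name `missing_tools` is made up, and `blastn` is only an assumption about which BLAST executable is needed; adjust the names to your setup.

```python
import shutil


def missing_tools(tools=("muscle", "blastn")):
    """Return the names in `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Missing tools:", missing_tools() or "none")
```

Note that this only confirms the executable exists; it says nothing about which version is installed.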

Jolein-Laumen commented 1 month ago

Hi Leonor and Zahidul,

I got the same error when including the -g argument: Bio.Application.ApplicationError: Non-zero return code 1 from 'muscle -in POR_out.fas -out POR_out.aln', message 'Invalid command line'

I found that this happens because the newer muscle version I had installed (5.1) uses the arguments -align and -output instead of the -in and -out used in the current pyngoST script. Installing muscle version 3.8.1551 solves the problem. Maybe you could add a paragraph on the dependencies and their versions (muscle==3.8.1551 and blast), or adapt the arguments in the script.
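If adapting the arguments is the preferred route, one possible approach is to pick the flags from the reported version. The sketch below is only an illustration and not pyngoST's code: the function names are hypothetical, and it assumes the version text contains a 'major.minor' number such as 'MUSCLE v3.8.1551' or 'muscle 5.1'.

```python
import re


def muscle_major_version(version_text):
    """Extract the major version from text such as 'MUSCLE v3.8.1551 ...'."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"cannot parse a MUSCLE version from {version_text!r}")
    return int(match.group(1))


def muscle_command(version_text, infile, outfile):
    """Build the alignment command for the detected MUSCLE generation."""
    if muscle_major_version(version_text) >= 5:
        # MUSCLE 5 uses -align / -output.
        return ["muscle", "-align", infile, "-output", outfile]
    # MUSCLE 3.8 uses -in / -out.
    return ["muscle", "-in", infile, "-out", outfile]
```

The version text could be captured with something like `subprocess.run(["muscle", "-version"], capture_output=True, text=True).stdout`, assuming the installed build prints its version for that flag, and the resulting list can then be passed to `subprocess.run(..., check=True)`.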

In addition, using the -q argument to set the path where output files are saved prevents the alignment of POR_out.fas and TBPB_out.fas; as a result, no report is saved and no table is printed to the screen.