DominikBuchner / BOLDigger-commandline

BOLDigger as a commandline tool
MIT License

error in boldblast_coi.py #1

Closed FabianRoger closed 2 years ago

FabianRoger commented 3 years ago

Hi,

I have just tried to use boldigger from the command line, but there seems to be an error when storing the results in a dataframe. See the output below.

Retrieving the data from BOLD clearly works, but appending the results of the second batch seems to fail.

boldigger-cline ie_coi "username" "password" /Users/fabian/Documents/01_Work/01_Research/15_eDNA_pilot/Data/COI_clustered_99.fasta /Users/fabian/Documents/01_Work/01_Research/15_eDNA_pilot/Data/boldiger

Login successfull
14:43:35: Requesting BOLD. This will take a while.                                                              
Downloading results: 100%||
29/29 [00:54<00:00,  1.88s/it]
14:45:03: Parsing html.                                                                                         
14:45:04: Saving results.                                                                                       
14:45:10: Removing finished OTUs from fasta.                                                                    
14:45:11: Requesting BOLD. This will take a while.                                                              
Downloading results: 0it [00:00, ?it/s]                                        | 1/75 [01:35<1:58:18, 95.93s/it]
14:45:12: Parsing html.                                                                                         
Requesting BOLD:   1%|▊                                                        | 1/75 [01:36<1:59:33, 96.93s/it]
Traceback (most recent call last):
  File "/Users/fabian/miniconda3/bin/boldigger-cline", line 8, in <module>
    sys.exit(main())
  File "/Users/fabian/miniconda3/lib/python3.8/site-packages/boldigger_cline/__main__.py", line 42, in main
    boldblast_coi.main(args.username, args.password, args.fasta_path, args.output_folder, args.batch_size)
  File "/Users/fabian/miniconda3/lib/python3.8/site-packages/boldigger_cline/boldblast_coi.py", line 67, in main
    dataframes = save_as_df(html_list, sequences_names[querys.index(query)])
  File "/Users/fabian/miniconda3/lib/python3.8/site-packages/boldigger/boldblast_coi.py", line 105, in save_as_df
    cols = dataframes[index].columns.tolist()
UnboundLocalError: local variable 'index' referenced before assignment

Here is a reproducible example:

cd ~/Downloads
curl -O https://raw.githubusercontent.com/DominikBuchner/BOLDigger-commandline/master/tests/COI.fasta

# make the file large enough to exceed the batch size, forcing the program to append a second batch
cp COI.fasta COI1.fasta
for f in {1..5}; do cat COI1.fasta >> COI.fasta; done
awk '/^>/{print ">seq_" ++i; next}{print}' COI.fasta  > COI1.fasta

mkdir boldiger_out

boldigger-cline ie_coi "username" "password" COI1.fasta boldiger_out

Edit:

Following the error message, moving line 105 of boldblast_coi.py (cols = dataframes[index].columns.tolist()) into the for loop seems to solve the problem (the program is running but hasn't completed yet).
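
For anyone landing here with the same traceback, this is a minimal sketch of how this class of error can arise. It is not BOLDigger's actual code: the function names and the use of plain dicts instead of dataframes are made up for illustration. The assumption is that `index` is only bound by a for loop, and that the loop never runs when a batch comes back empty (the second batch in the log above shows "Downloading results: 0it").

```python
def columns_of_last_table_buggy(tables):
    """Hypothetical stand-in for the failing pattern, not BOLDigger's code."""
    for index in range(len(tables)):
        pass  # per-table work would happen here
    # With an empty list the loop body never runs, so `index` was never bound.
    return list(tables[index].keys())


def columns_of_last_table_guarded(tables):
    """Same lookup, but an empty batch is handled explicitly."""
    if not tables:
        return []
    return list(tables[-1].keys())


def raises_unbound_local(func, *args):
    """Return True if calling func(*args) raises UnboundLocalError."""
    try:
        func(*args)
    except UnboundLocalError:
        return True
    return False


if __name__ == "__main__":
    print(raises_unbound_local(columns_of_last_table_buggy, []))
    print(columns_of_last_table_guarded([]))
```

Moving the lookup into the loop, as described in the edit, avoids the error for the same reason the guard does: the line is then only reached when the loop body actually runs.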

DominikBuchner commented 3 years ago

I need the example file to reproduce this problem. Usually the index referenced before assignment error comes from a malformed fasta file that leads to a malformed response from the BOLD server.
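
To rule out the input file as the cause, a quick sanity check along these lines might help. This is only a sketch: what counts as "malformed" here (a header without a sequence, sequence data before the first header, characters outside the IUPAC nucleotide codes) is an assumption, and `find_fasta_problems` is a hypothetical helper, not part of BOLDigger.

```python
def find_fasta_problems(lines):
    """Return (line_number, message) tuples for records that look malformed."""
    allowed = set("ACGTURYSWKMBDHVN-")  # IUPAC nucleotide codes plus gap
    problems = []
    header_line = None    # line number of the most recent header, if any
    has_sequence = False  # whether that header has sequence lines yet
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header_line is not None and not has_sequence:
                problems.append((header_line, "header without sequence"))
            if len(line) == 1:
                problems.append((number, "empty header"))
            header_line, has_sequence = number, False
        elif header_line is None:
            problems.append((number, "sequence before first header"))
        else:
            has_sequence = True
            if set(line.upper()) - allowed:
                problems.append((number, "unexpected characters in sequence"))
    # The last record also needs at least one sequence line.
    if header_line is not None and not has_sequence:
        problems.append((header_line, "header without sequence"))
    return problems
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `find_fasta_problems(open("COI1.fasta"))`. An empty result only means the file passed these particular checks.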

FabianRoger commented 3 years ago

Not sure what you mean. Does the reprex not run?

Also, as said in the edit, making the change fixed the problem for me.