qunfengdong / BLCA


ValueError: max() arg is an empty sequence | blastn issue #28

Closed shump2 closed 3 years ago

shump2 commented 3 years ago

Hi, I have built a custom NCBI nt database and generated the associated taxonomy file. The test.fasta data ran against this database without issue. However, with my own data I get the following error (note that I am using Python 3 and BLAST 2.12.0):

clustalo is located in your PATH!
Fasta file read in!
Reading in taxonomy information!
....
blastn is located in your PATH!
Running blast!!
Blastn Finished!!
read in blast file...
blastn file opened blast output read in
Start aligning reads...
Traceback (most recent call last):
  File "2.blca_main.py", line 412, in <module>
    outout.write(le + ":" + max(lexsum, key=lexsum.get) + ";" + str(max(lexsum.values())) + ";")
ValueError: max() arg is an empty sequence
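The failing call can be reproduced in isolation: `max()` raises a `ValueError` whenever the collection it receives is empty. What follows is a minimal sketch, assuming `lexsum` is a dict of taxon names to scores that ended up with no entries for some taxonomic level. The name `lexsum` comes from the traceback; `best_taxon` and the sample data are hypothetical and not part of BLCA.

```python
def best_taxon(lexsum):
    """Return (name, score) for the highest-scoring entry, or None if empty.

    Hypothetical helper for illustration only; not part of 2.blca_main.py.
    """
    if not lexsum:
        # Calling max() on an empty dict would raise ValueError here.
        return None
    name = max(lexsum, key=lexsum.get)
    return name, lexsum[name]


# An empty dict reproduces the kind of error shown in the traceback.
try:
    max({}, key={}.get)
except ValueError as err:
    error_message = str(err)

populated = best_taxon({"Escherichia": 87.5, "Shigella": 12.5})
empty = best_taxon({})
```

A guard like this only avoids the crash. It does not explain why no taxonomy votes were collected for the read in the first place.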

The blast search runs fine, but the alignment step that follows fails with the error above, and I am wondering what the issue could be.

The test.fasta.blastn output that works:

seq1 AE000782.1 100.000 470 0 0 1 470 1790478 1790009 0.0 869 470 minus 2178400 470
seq1 NR_074334.1 100.000 470 0 0 1 470 1 470 0.0 869 470 plus 1492 470
seq1 CP006577.1 99.787 470 1 0 1 470 1952565 1952096 0.0 863 467 minus 2316287 470
seq1 AB819341.1 99.786 468 1 0 3 470 1 468 0.0 859 465 plus 1487 470
seq1 EU573153.1 99.574 469 2 0 2 470 1 469 0.0 856 463 plus 934 470
seq1 FN356383.1 99.147 469 4 0 2 470 1 469 0.0 845 457 plus 1463 470
seq1 FN356382.1 99.147 469 4 0 2 470 1 469 0.0 845 457 plus 1465 470
seq1 FN356380.1 99.147 469 4 0 2 470 1 469 0.0 845 457 plus 1434 470
seq1 FN356423.1 98.934 469 5 0 2 470 1 469 0.0 839 454 plus 886 470
seq1 JF789483.1 99.563 458 0 2 15 470 1 458 0.0 833 451 plus 1328 470

The blastn output from my own data, which produces the error:

seq102 KX069362.1 100.000 471 0 0 1 471 188 658 0.0 815 441 plus 658 471
seq102 KX069361.1 100.000 471 0 0 1 471 188 658 0.0 815 441 plus 658 471
seq102 KF644040.1 100.000 471 0 0 1 471 188 658 0.0 815 441 plus 658 471
seq102 KF643756.1 99.788 471 1 0 1 471 188 658 0.0 809 438 plus 658 471
seq102 KF643376.1 99.788 471 1 0 1 471 188 658 0.0 809 438 plus 658 471
seq102 KF643271.1 99.788 471 1 0 1 471 188 658 0.0 809 438 plus 658 471
seq102 KF643895.1 100.000 465 0 0 1 465 188 652 0.0 804 435 plus 652 471
seq102 AB238459.1 99.363 471 3 0 1 471 188 658 0.0 798 432 plus 658 471
seq102 KF644122.1 100.000 419 0 0 1 419 188 606 0.0 774 419 plus 645 471
seq102 KF643711.1 100.000 419 0 0 1 419 188 606 0.0 774 419 plus 643 471

Running with -x, or with -a muscle, produces the same error. I have got this to work in the past, but recently I cannot get past the error above. I would appreciate any advice on this matter. P

qunfengdong commented 3 years ago

Other users have reported this issue before. Sorry, you will have to use Python 2 instead of Python 3.

shump2 commented 3 years ago

I have reformatted the custom database and it is now working smoothly!!!

lplough commented 2 years ago

> I have reformatted the custom database and it is now working smoothly!!!

@shump2, may I ask what you did to get this working for your database? I am getting a similar Python error with what appears to me to be a properly formatted database/taxonomy file.

Louis
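Since reformatting the database resolved the original report, one thing that may be worth checking is whether every subject accession in the blastn output has an entry in the taxonomy file. Here is a rough sketch, assuming the blastn output is whitespace-delimited with the subject accession in the second column (as in the rows posted above) and that the taxonomy file is tab-delimited with the accession in the first column. The function name and file names are hypothetical, and nothing in this thread confirms that such a mismatch was the actual cause.

```python
def missing_accessions(blast_path, taxonomy_path):
    """Return subject accessions in the blastn output absent from the taxonomy file.

    Assumes (not confirmed by the thread): taxonomy file is tab-delimited with
    the accession in column 1; blastn output has the subject ID in column 2.
    """
    with open(taxonomy_path) as handle:
        known = {line.split("\t", 1)[0].strip() for line in handle if line.strip()}

    missing = set()
    with open(blast_path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank or malformed lines; column 2 is the subject accession.
            if len(fields) > 1 and fields[1] not in known:
                missing.add(fields[1])
    return sorted(missing)
```

Calling it as `missing_accessions("my_reads.fasta.blastn", "my_taxonomy.txt")` (both file names are placeholders) returns a sorted list. A non-empty result would mean some hits have no taxonomy entry, for example because of a version suffix such as `.1` that is present in one file but not the other.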