yzhzu closed this issue 3 years ago
Is it possible for you to share your genome files? I can't replicate the error using my test data.
I actually did reproduce the error. The workaround is to either update the LS-BSR repository and try again, or rerun with the "-z T" flag. Please let me know if this doesn't fix your error.
It works with -z T. Thank you very much.
Dear all: today I tried to run LS-BSR on several bacterial genomes. However, an error occurred. Can someone help me? Thank you very much! The error information is as follows:

python ~/biosoft/LS-BSR/ls_bsr.py -d ../KP_Zeng -i 0.8 -f T -p 40 -c cd-hit -b blastp -t T -e T
LOG: 2021/07/02 17:37:31 - Testing paths of dependencies
/home/anaconda3/envs/pgcgap/bin/blastp
citation: Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, and Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389-3402
/home/anaconda3/envs/pgcgap/bin/prodigal
citation: Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, and Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119
/home/anaconda3/envs/pgcgap/bin/cd-hit
citation: Li, W., Godzik, A. 2006. Cd-hit: a fast program for clustering and comparing large sets of protein or nuceltodie sequences. Bioinformatics 22(13):1658-1659
LOG: 2021/07/02 17:37:31 - predicting genes with Prodigal
LOG: 2021/07/02 17:38:23 - Prodigal done
LOG: 2021/07/02 17:38:23 - Converting genbank files
LOG: 2021/07/02 17:40:11 - clustering with cd-hit at an ID of 0.8, length percentage of 0.9, using 40 processors
Duplicate header IDs: 626_3 626_4 738_22 ...
duplicate headers identified, renaming..
LOG: 2021/07/02 17:41:25 - starting blastp
LOG: 2021/07/02 17:46:13 - BLAST done
LOG: 2021/07/02 17:46:13 - Duplicate searching turned off
LOG: 2021/07/02 17:46:15 - starting matrix building
LOG: 2021/07/02 17:46:16 - The following genes had no hits in datasets or are too short, values changed to 0, check names and output:centroid_1530 centroid_2453 centroid_311 centroid_4813 centroid_5366 centroid_5638 centroid_6109 centroid_6695
LOG: 2021/07/02 17:46:16 - filtering duplicates
Traceback (most recent call last):
File "/home/biosoft/LS-BSR/ls_bsr.py", line 710, in
options.filter_scaffolds,options.prefix,options.intergenics,options.min_len,options.dup_toggle)
File "/home/biosoft/LS-BSR/ls_bsr.py", line 580, in main
num_filtered = filter_paralogs("%s/bsr_matrix_values.txt" % start_dir, "duplicate_ids.txt")
File "/home/biosoft/LS-BSR/ls_bsr/util.py", line 504, in filter_paralogs
with open(ids) as genomes_file:
FileNotFoundError: [Errno 2] No such file or directory: 'duplicate_ids.txt'
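
For anyone reading this traceback later: the failure is the plain open() call in filter_paralogs on a 'duplicate_ids.txt' that does not exist in the working directory. The log line "Duplicate searching turned off" just before it suggests the file was presumably never written in this run, though that is an inference from the log and not something confirmed in this thread. Below is a minimal sketch of a defensive check around that kind of open() call. The helper name read_duplicate_ids is hypothetical and is not part of LS-BSR; the actual fix in the repository may look different.

```python
import os


def read_duplicate_ids(ids_path):
    """Return the non-empty lines of ids_path, or [] if the file is absent.

    Hypothetical helper for illustration only; not part of LS-BSR.
    """
    # Guard the open() call so a missing file yields an empty list
    # instead of raising FileNotFoundError.
    if not os.path.isfile(ids_path):
        return []
    with open(ids_path) as genomes_file:
        # Strip newlines and skip blank lines.
        return [line.strip() for line in genomes_file if line.strip()]
```

With a guard like this, a caller would get an empty list of duplicate IDs when the file is missing and could skip the filtering step, instead of stopping with the traceback above.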