liaoherui / StrainScan

High-resolution strain-level microbiome composition analysis tool based on reference genomes and k-mers
https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-023-01615-w
MIT License

Using a custom database to identify strains gives IndexError: list index out of range #21

Open wsyjh opened 3 months ago

wsyjh commented 3 months ago

Hello! Thank you for the tool you developed. When I use a custom database to identify strains, it gives this error:

    terminate called after throwing an instance of 'std::runtime_error'
      what(): Unsupported format
    Aborted (core dumped)
    Failed to open input file 'temp_415d474aef2c11eea34abc97e1c3cf11.jf'
    rm: cannot remove 'temp_415d474aef2c11eea34abc97e1c3cf11.jf': No such file or directory

    193: weak

    parent node: 193 -> 194: weak 1: weak 1: 0.000000 | 0.000000 0

    parent node: 194 -> 195: weak 2: weak 2: 0.000000 | 0.000000 0

    Traceback (most recent call last):
      File "/home/data/t220324/miniconda3/envs/env4mamba/envs/strainscan/bin/strainscan", line 10, in <module>
        sys.exit(main())
      File "/home/data/t220324/miniconda3/envs/env4mamba/envs/strainscan/lib/python3.7/site-packages/StrainScan/StrainScan.py", line 196, in main
        cls_dict = identify_low_mem.identify_cluster(in_fq, db_dir + '/Tree_database', [0.1, 0.4, 1])
      File "/home/data/t220324/miniconda3/envs/env4mamba/envs/strainscan/lib/python3.7/site-packages/StrainScan/library/identify_low_mem.py", line 408, in identify_cluster
        search(pending, match_results, db_dir, valid_kmers, length, cov, abundance, cov_cutoff, ab_cutoff, results, leaves, res_temp, tree, overlapping_info)
      File "/home/data/t220324/miniconda3/envs/env4mamba/envs/strainscan/lib/python3.7/site-packages/StrainScan/library/identify_low_mem.py", line 247, in search
        print("parent node: %d ->"%tree.parent(group[0].identifier).identifier)
    IndexError: list index out of range

Everything seemed fine while building the custom database, and I am using the latest version. Any ideas? Could this be a bug in the memory efficient mode?
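For readers following the traceback above: the failing expression is `group[0]`, so the most likely reading is that `group` was an empty list at that point, possibly because the earlier jellyfish failure left nothing to match. The sketch below only reproduces that failure mode in isolation and shows a guarded access. `first_or_none` is a hypothetical helper for illustration, not part of StrainScan, and nothing here reflects how StrainScan actually builds `group`.

```python
from typing import Optional, Sequence


def first_or_none(group: Sequence) -> Optional[object]:
    """Return the first element of `group`, or None when it is empty.

    Hypothetical helper for illustration; not a StrainScan function.
    """
    return group[0] if group else None


# Indexing an empty list raises the same exception type as in the traceback.
empty_group = []
try:
    empty_group[0]
except IndexError as exc:
    error_message = str(exc)

# The guarded access returns None instead of raising.
guarded = first_or_none(empty_group)
```

A guard like this would only hide the symptom; the empty list itself still points back to the step that failed earlier in the log.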

liaoherui commented 3 months ago

Hi, Siyuan,

Thanks for using StrainScan. According to the log you provided, I noticed it says "what(): Unsupported format", which means the tool "jellyfish" cannot accept the sequencing reads format you input. May I know your full command to run the tool and your reads format? Also, you may consider not using memory efficient mode if you don't have too many reference genomes.

We will check whether this is a bug when you provide the above information. Thanks!
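As a rough way to answer the question about the reads format, the first bytes of the input file can be inspected before running the tool. The sketch below assumes that jellyfish expects plain-text FASTA or FASTQ and that compressed input is one possible cause of "Unsupported format"; that cause is an assumption and is not confirmed in this thread. `sniff_reads_format` is a hypothetical helper, not part of StrainScan or jellyfish.

```python
import gzip
import os
import tempfile


def sniff_reads_format(path: str) -> str:
    """Classify a reads file by its first bytes.

    Returns "gzip", "fastq", "fasta", "empty", or "unknown".
    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as handle:
        head = handle.read(2)
    if head == b"\x1f\x8b":  # gzip magic number
        return "gzip"
    if head[:1] == b"@":  # FASTQ records start with '@'
        return "fastq"
    if head[:1] == b">":  # FASTA records start with '>'
        return "fasta"
    if not head:
        return "empty"
    return "unknown"


# Small demonstration on temporary files containing one FASTQ record.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "reads.fq")
    packed = os.path.join(tmp, "reads.fq.gz")
    record = b"@read1\nACGT\n+\nIIII\n"
    with open(plain, "wb") as out:
        out.write(record)
    with gzip.open(packed, "wb") as out:
        out.write(record)
    plain_kind = sniff_reads_format(plain)
    packed_kind = sniff_reads_format(packed)
```

A result other than "fastq" or "fasta" would be worth reporting together with the full command line.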

wsyjh commented 3 months ago

> Hi, Siyuan,
>
> Thanks for using StrainScan. According to the log you provided, I noticed it says "what(): Unsupported format", which means the tool "jellyfish" cannot accept the sequencing reads format you input. May I know your full command to run the tool and your reads format? Also, you may consider not using memory efficient mode if you don't have too many reference genomes.
>
> We will check whether this is a bug when you provide the above information. Thanks!

I switched to another custom database built in normal mode, and the analysis ran without errors. The problem is probably in the process of building the database with memory efficient mode.

liaoherui commented 3 months ago

Good to hear that! We will check the potential bug for memory efficient mode later. Thanks for your valuable feedback!