bobeobibo / phigaro

Phigaro is a scalable command-line tool for predicting phages and prophages
MIT License
46 stars, 15 forks

Error with a big multifasta file #21

Closed (luayou closed this issue 4 years ago)

luayou commented 4 years ago

Hi there!

Thank you for developing this software! I have already obtained results with the Bacillus genome that you provide here, but now I am trying to run it on a multi-FASTA file of ~600 bacterial genomes, and I get the following error. I guess it could be avoided by not requesting the .html output, couldn't it? Also, the program created a folder called "proc"; could I get the results from there?

Thanks again, Laura

phigaro -f NatBiotech2019.639genomes.fasta -o NatBiotech2019.phigaro.out -p --not-open -e html tsv gff bed stdout -t 12 -d a

Traceback (most recent call last):
  File "/home/user/software/envs/phigaro_env/bin/phigaro", line 8, in <module>
    sys.exit(main())
  File "/home/user/software/envs/phigaro_env/lib/python3.7/site-packages/phigaro/cli/batch.py", line 210, in main
    task_output_file = run_tasks_chain(tasks)
  File "/home/user/software/envs/phigaro_env/lib/python3.7/site-packages/phigaro/batch/runner.py", line 20, in run_tasks_chain
    task.run()
  File "/home/user/software/envs/phigaro_env/lib/python3.7/site-packages/phigaro/batch/task/run_phigaro.py", line 283, in run
    plot_html(hmmer_records, begin, end)
  File "/home/user/software/envs/phigaro_env/lib/python3.7/site-packages/phigaro/to_html/preprocess.py", line 190, in plot_html
    widths = 1.0 * (np.array(widths) - min(widths)) / max(widths) + 0.1
ValueError: min() arg is an empty sequence

PollyTikhonova commented 4 years ago

Hello, I am very pleased that you use our tool. I fixed the defect and released a new version, 2.2.4-1 (2.2.4.post1). You can update the package via PyPI or git, or patch it locally following my commit (you only need to add one line), and you will not hit this error again.
File phigaro/to_html/preprocess.py, line 190: if len(widths) > 0:
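For readers who want to see why the guard helps, here is a minimal sketch of the idea, not the actual Phigaro code. The function names `normalize_widths` and `min_fails_on_empty` are hypothetical, plain Python lists stand in for the NumPy array used in the traceback, and the sketch assumes the largest width is non-zero.

```python
def normalize_widths(widths):
    """Scale widths with the formula from the traceback; return [] for empty input.

    Hypothetical helper for illustration only. Assumes max(widths) != 0.
    """
    if len(widths) == 0:
        # min() and max() raise ValueError on an empty sequence,
        # so return early instead of evaluating the formula.
        return []
    lo, hi = min(widths), max(widths)
    # Same arithmetic as the failing line: 1.0 * (w - min) / max + 0.1
    return [1.0 * (w - lo) / hi + 0.1 for w in widths]


def min_fails_on_empty():
    """Return True if min([]) raises ValueError, as in the reported error."""
    try:
        min([])
    except ValueError:
        return True
    return False
```

With the guard in place, a sequence that yields no widths produces an empty result instead of aborting the whole batch.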

To answer your question: yes, excluding the html output should help. And yes, if the proc folder is non-empty you can recover the results from it, or you can use the hmmer and prodigal outputs to recompute the prophage regions and rebuild the html and other files (using the --substitute-output option).

However, by default the proc folder is deleted after Phigaro finishes its work, so if you want to keep it you should add --no-cleanup.
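As a rough illustration of the two workarounds mentioned here (dropping the html output and keeping the proc folder), the sketch below assembles a command line from the flags that appear in this thread. The helper `build_phigaro_command` and its parameters are hypothetical, and the exact flag semantics should be checked against `phigaro --help` for your version.

```python
def build_phigaro_command(fasta, output, extensions, threads=1,
                          skip_html=True, keep_proc=False):
    """Assemble a phigaro argument list (hypothetical helper, not part of Phigaro)."""
    # Flags below are copied from the command shown in this issue.
    cmd = ["phigaro", "-f", fasta, "-o", output, "-p", "--not-open"]
    if skip_html:
        # Leave out the html output, which triggered the reported error.
        extensions = [ext for ext in extensions if ext != "html"]
    cmd += ["-e"] + list(extensions)
    cmd += ["-t", str(threads)]
    if keep_proc:
        # Keep the intermediate proc folder after the run.
        cmd.append("--no-cleanup")
    return cmd


command = build_phigaro_command(
    "NatBiotech2019.639genomes.fasta",
    "NatBiotech2019.phigaro.out",
    ["html", "tsv", "gff", "bed", "stdout"],
    threads=12,
    keep_proc=True,
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` or copied into a shell once the flags have been verified.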

Also, I consider the situation you ran into very atypical, and I would appreciate it if you could share the genome that caused the problem, so that we can investigate it more closely and perhaps improve our tool. The genome or genomes I need should have an empty plot and should NOT have any genes or pVOGs. My email: tikhonova.polly@mail.ru

luayou commented 3 years ago

Thank you! Now it is working without any issues. I would like to suggest providing a FASTA file with the predictions as one of the outputs. I got that from the html, but I would rather get it directly from the server.