ZiyueYang01 / VirID

VirID: An integrated platform for the discovery and characterization of RNA Viruses
MIT License

Exception when directly invoking the `phylogenetic_analysis` stage #4

Open cerebis opened 4 months ago

cerebis commented 4 months ago

When using the phylogenetic_analysis stage directly with new data, the workflow attempts to execute anicalc.py. However, since earlier stages were not run, required input files (such as blast output) are not available. As a result, the script produces a None result.

The method anicalc.py:prune_alns() attempts to act on a None object and raises the following exception.

Traceback (most recent call last):
  File "/foo/bar/miniconda3/envs/VirID/lib/python3.10/site-packages/VirID/rvm/anicalc.py", line 105, in <module>
    alns = prune_alns(alns)
  File "/foo/bar/miniconda3/envs/VirID/lib/python3.10/site-packages/VirID/rvm/anicalc.py", line 48, in prune_alns
    alns = [aln for aln in alns if aln['len'] >= min_length and aln['evalue'] <= min_evalue]
TypeError: 'NoneType' object is not iterable
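
For illustration, a minimal sketch of how the filter could tolerate a missing alignment list. The function name and the list comprehension come from the traceback above. The default values for min_length and min_evalue, and the choice to return an empty list, are assumptions for this sketch and not VirID's actual code.

```python
def prune_alns(alns, min_length=0, min_evalue=1e-3):
    """Filter alignments by length and e-value.

    The defaults above are placeholders for this sketch only.
    """
    # Assumption: a None input means no alignments were produced,
    # so there is nothing to prune and an empty list is returned.
    if alns is None:
        return []
    return [
        aln for aln in alns
        if aln['len'] >= min_length and aln['evalue'] <= min_evalue
    ]
```

With a guard of this kind the call at line 105 would receive an empty list instead of raising a TypeError. Whether an empty list is the right result for the later steps in anicalc.py would still need checking.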

At present, this can be avoided by specifying --keep_dup on the command line.

One approach to addressing this would be to invert the default CLI logic, replacing --keep_dup with an opt-in --rm_dup flag. Users could then invoke the last stage on contigs successfully, without critical caveats having to be added to your documentation. The normal end-to-end workflow could handle the reversal internally.
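
A rough sketch of what the inverted flag could look like with argparse. The parser, the build_parser and should_remove_duplicates helpers, and the end_to_end argument are all hypothetical names for illustration. Only --rm_dup is the flag proposed above, and none of this reflects VirID's current CLI code.

```python
import argparse


def build_parser():
    # Hypothetical parser; VirID's real CLI has other options.
    parser = argparse.ArgumentParser(prog="virid-sketch")
    parser.add_argument(
        "--rm_dup",
        action="store_true",
        help="remove redundant sequences (off by default)",
    )
    return parser


def should_remove_duplicates(args, end_to_end):
    # Assumption: the end-to-end workflow always deduplicates,
    # while a directly invoked stage does so only on request.
    return end_to_end or args.rm_dup
```

Here a direct call to the last stage skips deduplication unless --rm_dup is given, while the end-to-end workflow keeps its current behaviour by passing end_to_end=True.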

ZiyueYang01 commented 4 months ago

First, the redundancy-removal step does not require results from earlier stages; it performs a self-comparison of the input sequences.

The error `'NoneType' object is not iterable` means that there are no redundant sequences, and VirID can continue to run.