althonos / pyhmmer

Cython bindings and Python interface to HMMER3.
https://pyhmmer.readthedocs.io
MIT License

BUG: `plan7.HMM.validate()` fails to raise exceptions with ill-formatted HMM profiles #69

Open Sann5 opened 5 months ago

Sann5 commented 5 months ago

According to the documentation, "HMMER", "NAME", "LENG", "ALPH", and "HMM" are mandatory tags in the header of HMM files, but I found that:

  1. omitting the "ALPH" line causes a segmentation fault instead of raising an exception, and
  2. omitting the "HMM" line does not raise an exception at all.

It seems to me that this is an issue with the HMMER parser itself, since the same behaviour arises when using the CLI, e.g. `hmmconvert ill_formated.hmm`.
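
As a stopgap, the mandatory tags could be checked in plain Python before a file ever reaches the HMMER parser. The following is only a minimal sketch and not part of pyhmmer: `missing_mandatory_tags` is a hypothetical helper, and it assumes that the format line starts with a token such as `HMMER3/f`, that every other tag is the first whitespace-separated token on its line, and that a record ends at a `//` line.

```python
from typing import List

# Mandatory header tags, as listed in the report above.
MANDATORY_TAGS = ("HMMER", "NAME", "LENG", "ALPH", "HMM")


def missing_mandatory_tags(record_text: str) -> List[str]:
    """Return the mandatory tags that are absent from one profile record.

    Hypothetical helper: it only looks at the first token of each line
    and does not check the values that follow the tags.
    """
    seen = set()
    for line in record_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        first = tokens[0]
        if first == "//":
            # Assumed record terminator: stop scanning this profile.
            break
        if first.startswith("HMMER"):
            # Assumed format line, e.g. "HMMER3/f [...]".
            seen.add("HMMER")
        elif first in MANDATORY_TAGS:
            seen.add(first)
    return [tag for tag in MANDATORY_TAGS if tag not in seen]
```

A caller could read the file as text, call `missing_mandatory_tags` on it, and refuse to open it with `HMMFile` if the returned list is not empty. This covers only the two cases described here and is not a substitute for a full format check.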

Here is the relevant code to reproduce the behavior (I'm using a MacBook Pro with an Intel processor and macOS Sonoma):

  1. Create an environment

    conda create --name test_pyhmmer bioconda::pyhmmer
    conda activate test_pyhmmer

  2. Download and unzip the attached hmm files and test script (test_files_and_script.zip)

  3. Try validating the hmm files with the attached script

    # Expected behavior: exception raised
    # Observed behavior: segmentation fault
    cd my_downloads_are_here
    ./validate_hmm_profiles.py no_alph.hmm

    # Expected behavior: exception raised
    # Observed behavior: no exception raised
    ./validate_hmm_profiles.py no_hmm.hmm

Just as a reference, this is what the validation script is doing...

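    import sys

    from pyhmmer.plan7 import HMMFile

    # Validate every profile in the file given as the first argument.
    with HMMFile(sys.argv[1]) as hmm_file:
        for hmm_profile in hmm_file:
            # Expected to raise an exception for an invalid profile.
            hmm_profile.validate(tolerance=0.0001)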
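
Because a segmentation fault kills the interpreter and cannot be caught with `try`/`except`, one way to observe it safely is to run the validation script in a child process and inspect its return code. The following is a minimal sketch under that assumption: `run_isolated` is a hypothetical helper, and the negative-return-code convention it relies on applies to POSIX systems only.

```python
import signal
import subprocess
import sys


def run_isolated(args):
    """Run a command in a child process and describe how it ended."""
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was
        # terminated by the signal with that number.
        return "killed by " + signal.Signals(-result.returncode).name
    if result.returncode != 0:
        return "failed with exit code %d" % result.returncode
    return "ok"


# Example call, using the file names from the steps above:
# run_isolated([sys.executable, "validate_hmm_profiles.py", "no_alph.hmm"])
```

With this wrapper the parent process survives a crash in the parser and can report it like any other failure.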
Sann5 commented 5 months ago

Small update. I emailed the Eddy Lab because of this bug. Sean said they have addressed both issues and that the changes will appear in the next HMMER3 release.