Closed ChristophKnapp closed 1 year ago
You are aligning sequences to an empty HMM. What are you expecting to happen?
Not a snappy answer and not a segmentation fault. Go away if you can't communicate.
On Tue, 4 Apr 2023, 19:10 Zachary Kurtz, @.***> wrote:
You are aligning sequences to an empty HMM. What are you expecting to happen?
— Reply to this email directly, view it on GitHub https://github.com/althonos/pyhmmer/issues/36#issuecomment-1496322060, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4MJIZAFUDCSPSIQAFXXETW7RIZ7ANCNFSM6AAAAAAWSUKQXA . You are receiving this because you authored the thread.Message ID: @.***>
I was simply asking if you expected that code to work or if you were just flagging this for better error handling...
Neither. As I said in the issue description, I'm just getting started. When you are still learning, a segmentation fault leaves you helpless. Read my text properly. I even said that this could be entirely my fault. An answer like yours is neither polite nor helpful.
@ChristophKnapp I've re-read your comment several times and I still don't know what you need help with.
I agree the segfault is not ideal... perhaps it's coming from hmmer3, like the other issue you linked, and can't be fixed in the python code - but in that case the segfault could still be prevented in pyhmmer by raising an error when trying to use an empty HMM object.
If you're trying to construct an HMM for use in an actual project, you either have to build an HMM from sequences yourself (see this example) or read an existing HMM from a file.
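For readers arriving with the same question, the two options above could look roughly like the sketch below. It assumes a recent pyhmmer release (the 0.7 series or later). The function names `load_hmm` and `build_hmm`, the default model name and the file paths are hypothetical. Building from an alignment needs an aligned input file, not raw unaligned FASTA.

```python
def load_hmm(hmm_path):
    """Read the first HMM stored in an existing HMMER3 file."""
    # Imported lazily so the sketch can be loaded without pyhmmer installed.
    import pyhmmer

    with pyhmmer.plan7.HMMFile(hmm_path) as hmm_file:
        return hmm_file.read()


def build_hmm(msa_path, name=b"my_family"):
    """Build an HMM from a multiple sequence alignment (e.g. aligned FASTA)."""
    import pyhmmer

    alphabet = pyhmmer.easel.Alphabet.amino()
    # Read the alignment in digital mode so the builder can consume it.
    with pyhmmer.easel.MSAFile(msa_path, digital=True, alphabet=alphabet) as msa_file:
        msa = msa_file.read()
    msa.name = name  # the builder expects a named alignment (assumed name here)

    builder = pyhmmer.plan7.Builder(alphabet)
    background = pyhmmer.plan7.Background(alphabet)
    # build_msa returns the HMM together with two profile objects.
    hmm, _profile, _optimized_profile = builder.build_msa(msa, background)
    return hmm
```

Either function returns a populated HMM, which is the kind of object the alignment code expects instead of a freshly constructed empty one.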
Hi @ChristophKnapp,
I understand how Zachary's comment could come across as snappy, but he has always been very helpful and there is no reason to take it as such. It is common for less mature projects to have confusing error paths, so people often report issues where the actual issue report could be improved (for instance, the Rust language has an entire working group dedicated to improving errors).
To address the issue itself: indeed, what you are doing is not standard, and probably not something the HMMER CLI would have allowed you to do. hmmalign (in HMMER) uses an HMM to align several sequences together, with the HMM as the reference. This can be really helpful when you are trying to align several new members of a protein family, for instance. By manually creating an empty HMM, you enter a path not planned for by HMMER, which makes it hard to know why you're getting a segfault. In any case, end users like you should not get a segfault, so I will look into the code path so that this raises a proper error ahead of the crash. On your side, you should indeed follow Zachary's suggestion and either build an initial HMM from the input alignment, or load an external HMM to align to, if aligning your sequences to a reference HMM is what you intend to do.
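As an illustration of that workflow, here is a minimal sketch of aligning sequences to a reference HMM with `TraceAligner`. It assumes pyhmmer 0.7 or later (where `SequenceFile.read_block` is available). The function name `align_to_reference` and both paths are hypothetical, so please check the calls against the documentation of your installed version.

```python
def align_to_reference(hmm_path, fasta_path):
    """Align the sequences in `fasta_path` to the first HMM in `hmm_path`."""
    # Imported lazily so the sketch can be loaded without pyhmmer installed.
    import pyhmmer

    # Load a real, populated HMM instead of constructing an empty one.
    with pyhmmer.plan7.HMMFile(hmm_path) as hmm_file:
        hmm = hmm_file.read()
    if hmm is None:
        raise ValueError(f"no HMM found in {hmm_path!r}")

    # Read the sequences in digital mode, using the alphabet of the HMM.
    with pyhmmer.easel.SequenceFile(
        fasta_path, digital=True, alphabet=hmm.alphabet
    ) as seq_file:
        sequences = seq_file.read_block()

    # Compute one trace per sequence, then turn the traces into an alignment.
    aligner = pyhmmer.plan7.TraceAligner()
    traces = aligner.compute_traces(hmm, sequences)
    return aligner.align_traces(hmm, sequences, traces)
```

The returned object is a multiple sequence alignment in which every input sequence has been placed against the columns of the reference model.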
You are all reading too much into my issue report. As I said before, I was getting started, wrote my first 4 lines of code and did expect problems with it. Then I got a "segmentation fault". With a different error message I can work. I can reread the documentation and retry, and figure out myself where I went wrong. With what I got, I didn't know where to start.
This bug was caused by a combination of two things, both of which are now fixed in v0.7.4:
- TraceAligner didn't validate HMMs given in input, and the underlying HMMER code doesn't either, so it could crash on erroneous data; this was fixed in fe6b9165794811c3aba984a9b28879ec7a50b9cf.
- Creating an HMM from Python didn't produce a valid HMM; this was fixed in 002c68ea438802431e2e4db2cc24e80ec06a40ae so that HMM.__init__ uses arbitrary probabilities to ensure validity.
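To make the idea of validating a model before handing it to native code more concrete, here is a small pure-Python sketch. It is not pyhmmer's or HMMER's actual implementation: the model is reduced to a hypothetical list of match-emission rows, and the names `validate_distribution` and `validate_model` are invented for illustration. The assumption is that a valid model has at least one node and that every probability row sums to 1.

```python
import math


def validate_distribution(probabilities, label="vector", tolerance=1e-5):
    """Raise ValueError unless the values form a probability distribution."""
    if len(probabilities) == 0:
        raise ValueError(f"{label} is empty")
    if any(p < 0 or math.isnan(p) for p in probabilities):
        raise ValueError(f"{label} contains a negative or NaN value")
    total = math.fsum(probabilities)
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        raise ValueError(f"{label} sums to {total}, expected 1.0")


def validate_model(match_emissions):
    """Check every node of a toy model (one emission row per node)."""
    if len(match_emissions) == 0:
        raise ValueError("model has no nodes")
    for index, row in enumerate(match_emissions):
        validate_distribution(row, label=f"match emissions at node {index}")


# A uniform row passes; an all-zero row (as in an uninitialised model) does not.
validate_model([[0.25, 0.25, 0.25, 0.25]])
try:
    validate_model([[0.0, 0.0, 0.0, 0.0]])
except ValueError as error:
    print("rejected:", error)
```

Failing early with a `ValueError` like this gives the user a message to act on, which is the behaviour asked for in this thread.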
Hi there
This might be my fault (I'm just getting started with pyhmmer) or related to the issue "new empty HMM segfaults when saved to file", but when I execute this code
I'm getting a "Segmentation fault (core dumped)".
Given that I would expect a meaningful error if my syntax were incorrect, the only way I can get help is by letting you know.
The pfam_IPR002213_top19.fasta file contains the top 19 sequences of a much larger file.
Regards
Christoph