EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

Adds -F option to hmmc2 to report hits as Stockholm format MSA for an… #272

Open foreveremain opened 2 years ago

foreveremain commented 2 years ago

PR is a result of discussions with @biomadeira

EBI Web services currently do not support downloading the hits from an hmmsearch as an MSA. This PR adds a new '-F' flag to hmmc2 that dumps the hit alignment to stdout as a Stockholm file.

I can't reproduce the EBI setup, so I haven't yet been able to verify that hit metadata are propagated correctly, though sequence IDs (numeric ones) do appear in the output.
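
For readers unfamiliar with the target format, here is a minimal sketch of what a Stockholm alignment looks like when built from a set of aligned hits. It is not the code in this PR and not hmmc2's actual output: the function name `to_stockholm` and the example IDs and sequences are made up for illustration. The only format details relied on are the `# STOCKHOLM 1.0` header, one `name  sequence` line per aligned sequence, and the closing `//`.

```python
def to_stockholm(aligned):
    """Render {name: aligned_sequence} as a minimal Stockholm 1.0 string.

    Hypothetical helper for illustration; it writes no annotation lines.
    """
    if not aligned:
        raise ValueError("no sequences to write")
    # All rows of an alignment must have the same number of columns.
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # Pad names so the sequence columns line up.
    width = max(len(name) for name in aligned)
    lines = ["# STOCKHOLM 1.0"]
    for name, seq in aligned.items():
        lines.append(f"{name.ljust(width)}  {seq}")
    lines.append("//")
    return "\n".join(lines) + "\n"


# Made-up hits keyed by numeric IDs, similar in spirit to the IDs mentioned above.
hits = {"1001": "MKV-LA", "20445": "MKVTLA"}
print(to_stockholm(hits), end="")
```

With numeric IDs as the only sequence names, a file like this is valid Stockholm but carries no descriptive metadata, which is the open question raised above.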

npcarter commented 2 years ago

Hello. Is this something that you're working on in conjunction with an EBI plan to add this feature to their HMMER server? hmmc2 is a test program that we use internally, and isn't part of EBI's web server. They have a separate set of code that sends queries to hmmpgmd and passes the results to their database server to retrieve metadata and taxonomy information. Given that, we're not looking to add features to hmmc2 unless the EBI team would like them for testing purposes.

-Nick


biomadeira commented 2 years ago

Hi @npcarter! Just for a bit of context, this is related to the EBI tools framework (Job Dispatcher), which is not run by the HMMER team. In any case, we are running HMMER daemons and retrieving results with hmmc2 as instructed by them a few years back. Do you provide any "official" client to retrieve results or is it up to us to develop one?

npcarter commented 2 years ago

Ok, so if I understand you correctly, you're running your own HMMER daemons, separate from the set run by the HMMER web server team, and want to send requests to and receive results from them? That's going to be challenging, as hmmpgmd was never really intended to be used that way. The big problem is that hmmpgmd is designed to be integrated with EBI's taxonomy database, so all of the metadata is stripped out of the database that hmmpgmd sees and replaced with a unique ID that gets looked up in a separate metadata database, both to reduce the amount of memory used by hmmpgmd and to let EBI add information, like taxonomy hooks. (apologies if I'm telling you things you already knew)
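
The ID indirection described above can be sketched as follows. This is an illustration only, not EBI's or HMMER's code: the function `relabel_hits`, the in-memory `metadata` dict standing in for the separate metadata database, and all accessions, descriptions, and scores are hypothetical.

```python
def relabel_hits(hits, metadata):
    """Swap numeric sequence IDs for metadata looked up in a separate store.

    hits:     list of (numeric_id, score) pairs, as a search daemon might return
    metadata: dict mapping numeric_id -> {"accession": ..., "description": ...}
              (a stand-in for an external metadata database)
    IDs with no metadata entry are kept as-is so that no hit is dropped.
    """
    relabeled = []
    for seq_id, score in hits:
        record = metadata.get(seq_id)
        name = record["accession"] if record else str(seq_id)
        desc = record["description"] if record else ""
        relabeled.append((name, desc, score))
    return relabeled


# Made-up data: the search side only ever sees the numeric IDs.
metadata = {
    1001: {"accession": "ACC_0001", "description": "example protein A"},
}
hits = [(1001, 87.5), (20445, 42.0)]
for name, desc, score in relabel_hits(hits, metadata):
    print(name, desc, score)
```

A client without access to the metadata store can only report the bare numeric IDs, as the second hit in this example shows.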

The summary of all this is that we don't have anything better than hmmc2 in the HMMER package. The HMMER team at EBI has their own set of tools to retrieve and parse results from hmmpgmd.

I don't know if you know this, but Nicolo at EBI is working on a major revision of the EBI HMMER web tools, and one of the goals of that is to move the metadata back into the HMMER server and eliminate the need to have a separate metadata database.

-Nick


biomadeira commented 2 years ago

That helps! People come and go and I was not aware of some of the details. Yes, we are running our own HMMER daemons, but they were never intended to replace or compete with the HMMER team at EBI. Our collaboration with the HMMER team has mostly come down to the institute's requirement that these tools be presented and available via common APIs, which we (try to) provide.

Good to know Nicolo is working on a new iteration of the web tools. I will try to keep up with that and get in touch with him.

In relation to the PR, I can confirm Jim's changes compile and work as intended. Since the feature is only active when the new flag is given, I don't see any harm in merging it, but I understand if you would rather not.

foreveremain commented 1 year ago

Hi Nick - perhaps tag Nicolo? That might make the cross-office comms a bit more direct ;)
