EddyRivasLab / easel

Sequence analysis library used by Eddy/Rivas lab code
Other
46 stars 26 forks source link

Add horizontal sum of float32x4 NEON vector #57

Closed althonos closed 3 years ago

althonos commented 3 years ago

Hi!

I started porting HMMER to the Aarch64 platform (althonos/hmmer-neon), and while doing so (using the SSE code as a reference) I noticed Easel was missing this function. Its SSE counterpart (esl_sse_hmax_ps) is used in the Viterbi score implementation.

This PR adds the esl_neon_hmax_f32 function, using either:

Both functions were tested on a Raspberry Pi 4.

cryptogenomicon commented 3 years ago

Thanks. This function isn't actually needed in HMMER but doesn't hurt to have it.

FYI, we already have a complete ARM port in H4 (h4-develop branch) and have been planning to bring that code back into H3 for the new Apple M1's. Nick Carter can give you more information.

althonos commented 3 years ago

Oh great !

I was able to get HMMER3 running (and passing the test suite) on Aarch64 with everything but the optimized SSV filter, I'll make a PR and perhaps Nick and I can discuss there. Cheers!

npcarter commented 3 years ago

Wow, it's great to hear that you've gotten so much done on the ARM port of H3. As Sean mentioned, we started working on that port when the M1 Macs were announced, but stopped when we discovered that using the Rosetta software to run the x86 version of HMMER only incurred a 5% or so performance penalty. However, enough people have had trouble getting HMMER to work under Rosetta that it'd be very useful to have a native ARM version.

If you've gotten everything but the SSV filter to work, then we should be pretty much done. The SSV filter isn't one of the parts of HMMER that changed between H3 and H4, so I can just grab the ARM version of that filter from the H4 code base. After that, the only remaining thing will be plugging everything into our build system.

Could you make your pull request to the h3-arm branch of our code base? That's where the existing ARM work for H3 is, and it'll be easier to debug/test there and then merge than to work in our main development branch.

-Nick

On Mon, Mar 15, 2021 at 8:29 AM Martin Larralde @.***> wrote:

Oh great !

I was able to get HMMER3 running (and passing the test suite) on Aarch64 with everything but the optimized SSV filter, I'll make a PR and perhaps Nick and I can discuss there. Cheers!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/EddyRivasLab/easel/pull/57#issuecomment-799380259, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDJBZAEM3YQJLXDVRJ2YD3TDX4THANCNFSM4ZFK732A .

althonos commented 3 years ago

Done !

However, enough people have had trouble getting HMMER to work under Rosetta that it'd be very useful to have a native ARM version.

Exactly, we are using HMMER3 to annotate with Pfam in one of our tools, and portability is more of an issue than performance, given we analyse small sequences and that some of our end-users are using the new Macbook as well.

npcarter commented 3 years ago

Great, thanks! I'll try to merge in the SSV code and start testing soon.

-Nick

On Tue, Mar 16, 2021 at 9:37 AM Martin Larralde @.***> wrote:

Done !

However, enough people have had trouble getting HMMER to work under Rosetta that it'd be very useful to have a native ARM version.

Exactly, we are using HMMER3 to annotate with Pfam in one of our tools, and portability is more of an issue than performance, given we analyse small sequences and that some of our end-users are using the new Macbook as well.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/EddyRivasLab/easel/pull/57#issuecomment-800264628, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDJBZBF2NQD6J22GW6OUPTTD5NJZANCNFSM4ZFK732A .

npcarter commented 3 years ago

Hello,

Would you be willing to create a second pull request for your ARM code? Because you started with h3-develop, there were a bunch of unrelated changes that had been added to h3-develop since I created h3-arm that made it harder to see what was your code and what was ours. I've updated h3-arm off of h3-develop, so if you create a new pull request, it should just be your stuff, which will be much easier to review.

Thanks,

-Nick

On Tue, Mar 16, 2021 at 9:37 AM Martin Larralde @.***> wrote:

Done !

However, enough people have had trouble getting HMMER to work under Rosetta that it'd be very useful to have a native ARM version.

Exactly, we are using HMMER3 to annotate with Pfam in one of our tools, and portability is more of an issue than performance, given we analyse small sequences and that some of our end-users are using the new Macbook as well.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/EddyRivasLab/easel/pull/57#issuecomment-800264628, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDJBZBF2NQD6J22GW6OUPTTD5NJZANCNFSM4ZFK732A .

althonos commented 3 years ago

Hi @npcarter : I rebased to h3-arm, you can check the new pull request hmmer#233.