voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

Megahit on 64-bit Arm Neoverse cores #320

Open higaredavm opened 2 years ago

higaredavm commented 2 years ago

We are exploring bioinformatics on AWS. Specifically, we want to run genome assembly on AWS Graviton2 processors, which have an ARM architecture. Is there a version of MEGAHIT built for the ARM / aarch64 architecture?

aclum commented 2 years ago

We have had good success running other bioinformatics tools on Graviton2 and would also be interested in a version of MEGAHIT that compiles on ARM.

higaredavm commented 2 years ago

> We have had good success running other bioinformatics tools on Graviton2 and would also be interested in a version of MEGAHIT that compiles on ARM.

I am curious which bioinformatics tools you have been running on Graviton2.

aclum commented 2 years ago

From BBTools, BBMap and BBcms both run on Graviton2. These are AWS blog posts on Peregrine and BWA on Graviton2:

https://aws.amazon.com/blogs/publicsector/accelerating-genome-assembly-aws-graviton2/
https://aws.amazon.com/blogs/publicsector/generalized-approach-benchmarking-genomics-workloads-cloud-bwa-read-aligner-graviton2/

aclum commented 2 years ago

I made some progress on getting MEGAHIT working on ARM but am stalled on the third of the issues below.

The first issue is the x86-specific compiler flags (-mbmi2 and -mpopcnt), which can be removed from the CMakeLists.txt file.

The second issue is x86intrin.h, which can be fixed by replacing it with the openvec*.h files from https://github.com/OpenVec/OpenVec

The third issue is an "impossible constraint in ‘asm’" error. Based on a Stack Overflow search, this can be fixed by rewriting the code in C++ instead of assembly: https://stackoverflow.com/questions/1478513/linux-assembler-error-impossible-constraint-in-asm

Would it be possible to get the third issue fixed?
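To illustrate what "rewriting in C++ instead of assembly" could look like, here is a minimal sketch of a portable 64-bit popcount. It is not taken from the MEGAHIT source: `PortablePopcount64` is a hypothetical helper name, and the choice of guard macros is an assumption about GCC/Clang toolchains. The include guard also shows one way to keep `x86intrin.h` out of non-x86 builds.

```cpp
#include <cstdint>

// x86intrin.h only exists for x86 targets, so include it conditionally.
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

// Hypothetical helper (not a MEGAHIT function): counts set bits without
// inline assembly, so it compiles on both x86_64 and aarch64.
inline unsigned PortablePopcount64(std::uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
  // The compiler picks the best instruction for the target
  // (POPCNT on x86 when enabled, CNT on aarch64).
  return static_cast<unsigned>(__builtin_popcountll(x));
#else
  // Plain C++ fallback: clear the lowest set bit until none remain.
  unsigned n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
#endif
}
```

Leaving instruction selection to the compiler avoids per-architecture `asm` constraints entirely, at the cost of relying on the compiler's code generation.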

voutcn commented 2 years ago

You can just replace https://github.com/voutcn/megahit/blob/7dde1cae4dfa8ced0a9bd524894df36e9cd4185b/src/utils/cpu_dispatch.h#L8-L34 with

inline bool HasPopcnt() { return false; }
inline bool HasBmi2() { return false; }

I don't have an Arm machine to test. Let me know if that works. Also feel free to send a PR.

@aquaskyline to follow up.
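If the x86 fast paths should stay available, the two stubs could instead be guarded by architecture. The following is an untested sketch, not the code in the repository: it assumes a GCC or Clang toolchain and uses their `__builtin_cpu_supports` builtin in place of hand-written `cpuid` assembly.

```cpp
// Sketch: runtime feature detection on x86, constant "false" elsewhere.
// Assumes GCC or Clang; other compilers fall through to the stubs.
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))

inline bool HasPopcnt() {
  __builtin_cpu_init();  // safe to call more than once
  return __builtin_cpu_supports("popcnt") != 0;
}

inline bool HasBmi2() {
  __builtin_cpu_init();
  return __builtin_cpu_supports("bmi2") != 0;
}

#else

// Non-x86 targets (e.g. aarch64 / Graviton2): report no x86 extensions,
// so callers take the generic code path.
inline bool HasPopcnt() { return false; }
inline bool HasBmi2() { return false; }

#endif
```

With this shape, callers do not need to change: on ARM both functions return false, and on x86 the answer depends on the CPU the binary runs on.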

accopeland commented 2 years ago

For the record, the above suggestions are sufficient to compile MEGAHIT on Graviton2. I am working on a PR.

higaredavm commented 2 years ago

@voutcn We made the changes you suggested, but we are now getting this error. I really appreciate your help, thanks.

[screenshot: image-2]

voutcn commented 2 years ago

I think the error has nothing to do with ARM. One *.edges.info file is broken for some unknown reason. Could you retry?

martin-g commented 5 months ago

Hi!

I created a new PR: https://github.com/voutcn/megahit/pull/368. It is built on top of #329 but preserves the x86_64 optimizations.