databio / gtars

Performance-critical tools to manipulate, analyze, and process genomic interval data. Primarily focused on building tools for geniml - our genomic machine learning python package.
2 stars 1 forks source link

Python bindings wont build for linux `aarch64` #18

Closed nleroy917 closed 1 month ago

nleroy917 commented 2 months ago

With the latest release... the reqwest crate got added such that we can download pre-trained model universes directly inside from within Rust:

use genimtools::tokenizers::TreeTokenzer;

let tokenizer = TreeTokenizer::from_pretrained("databio/r2v-luecken2021-hg38-v2");

reqwest requires openssl to be installed on the actual machine that is building the rust package/crate. This caused the GitHub action builds to start failing. I initially thought that installing openssl headers in the GitHub runner would solve the issue, but that was not sufficient. It turns out that the GitHub runner (a docker container) is downloading another docker container to actually build the wheels. This is all orchestrated through the maturin github action. Thus, we need to install openssl headers inside the secondary docker container.

Luckily for us, the maturin action lets us run scripts inside that very container prior to running the build through the before-script-linux command. So I added a yum -y install openssl-devel command.

This was not sufficient... We are building for several architectures (x86 and aarch64). As it turns out, the subsequent inner docker containers pulled for these use separate package managers. So, while it might work for one, it will fail for the other because, command yum not found.

I enhanced my script a bit to check for a package manager before trying to use it:

# Check for yum and use it if available, otherwise use apt-get
if command -v yum &> /dev/null; then
  echo "Detected yum package manager"
  yum -y install openssl-devel
elif command -v apt-get &> /dev/null; then
  echo "Detected apt-get package manager"
  apt-get update
  echo "Installing libssl-dev pkg-config openssl musl-dev"
  apt-get install -y libssl-dev pkg-config openssl musl-dev
else
  echo "No supported package manager found (yum or apt-get)"
  exit 1
fi

This got the openssl hedaers to install correctly inside the correct containers... great. However there is one final issue: The aarch64 container is actually just x86 trying to cross-compile to aarch64. So, installing openssl just installs the x86 version and then during cross-compilation the header for aarch64 is still missing.

I am stuck here... taking a break since I've wasted so many hours on this. Luckily I am not too alone as someone else has had this exact issue before on stack overflow, but I have less control than them over the compilation since I am restricted to single bash commands.

Will resume later...

nleroy917 commented 2 months ago

One option seems to be that we can enable the "vendored" feature for openssl-sys

nleroy917 commented 2 months ago

Note: when using the vendored feature... we need the following:

The build process requires a C compiler, perl (and perl-core), and make

So I needed to add the perl toolchain to the installation scripts:

# Check for yum and use it if available, otherwise use apt-get
if command -v yum &> /dev/null; then
  echo "Detected yum package manager"
  yum -y install openssl-devel perl-IPC-Cmd perl-core # <--- new package added
elif command -v apt-get &> /dev/null; then
  echo "Detected apt-get package manager"
  apt-get update
  echo "Installing libssl-dev pkg-config openssl musl-dev"
  apt-get install -y libssl-dev pkg-config openssl musl-dev perl # <--- new package added
else
  echo "No supported package manager found (yum or apt-get)"
  exit 1
fi
nleroy917 commented 2 months ago

Ok this got me further, but now I have the following error:

 #error "ARM assembler must define __ARM_ARCH

Luckily someone doing the Rust-Python dance with pyo3 and maturin has hit this before it seems... https://github.com/messense/typst-py/commit/bafa4547f74374d05afc515f4318bcfb96c619d7

I found this commit by landing on this apache/opendal issue: https://github.com/apache/opendal/issues/3673

Suggestions is to set the following ENV variable in the builder container:

CFLAGS_aarch64_unknown_linux_gnu: "-D__ARM_ARCH=8"
nleroy917 commented 2 months ago

We get further! But alas, a new error:

cannot find "-latomic"

According to stack overflow:

The -l switch asks the linker to use a certain library. It should followed by the name of a library or a file system path to the library.

So I will try to install that as well in the pre-build script. After some googling it seems that for yum its libatomic and for apt-get its atomic1. New script is:

# Check for yum and use it if available, otherwise use apt-get
if command -v yum &> /dev/null; then
  echo "Detected yum package manager"
  yum -y install openssl-devel perl-IPC-Cmd perl-core libatomic # <--- new package added
elif command -v apt-get &> /dev/null; then
  echo "Detected apt-get package manager"
  apt-get update
  echo "Installing libssl-dev pkg-config openssl musl-dev"
  apt-get install -y libssl-dev pkg-config openssl musl-dev perl libatomic1 # <--- new package added
else
  echo "No supported package manager found (yum or apt-get)"
  exit 1
fi
nleroy917 commented 2 months ago

That didn't work... same error. It's possible I am not installing the write stuff. I'm going to try adding make to the installs and see what that does

nleroy917 commented 2 months ago

I know that it was discussed that linux on aarch64 is a nice thing to support, but I am wondering if temporarily we can just remove this from the build?

nleroy917 commented 1 month ago

Notes from the infrastructure meeting

Short term solution: Just don't compile to linux/aarch64 Medium term solution: Revisit the idea of pushing lots of logic to the python interface.

nleroy917 commented 1 month ago

I re-added the python layer in geniml and removed the ability to instantiate a Tokenizer from huggingface for now. This got rid of openssl and reqwest and now things build nicely.

nleroy917 commented 1 month ago

This is solved since we removed openssl dependency