sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License

Add linux aarch64 support #386

Open leodiegues opened 2 years ago

leodiegues commented 2 years ago

Hi!

I've been really struggling to install the package on an EC2 instance that has an aarch64 processor and runs Ubuntu. Are there any plans to add aarch64 support for Linux machines in the future?

Regards

wangxiaoying commented 2 years ago

Hi @leonardodiegues, we currently use GitHub Actions and GitHub-hosted runners to manage our release process, so we cannot support aarch64 Linux since it is not in the list here. For Linux aarch64, you may need to build your own wheel file as described in this doc.

leodiegues commented 2 years ago

Thank you for your answer. I'll follow the docs build instructions!

Sirsirious commented 1 year ago

The build instructions do not work out of the box, especially since they point to ever-changing Rust nightly versions.

The lack of official Linux aarch64 support hinders adoption of this library, since that is essentially the platform every developer using a Mac M1 with Docker ends up needing for local testing. Could we have better instructions or support for that?

wangxiaoying commented 1 year ago

Hi @Sirsirious, thanks for the comment. For mac M1 we do have precompiled wheel file support. You can just run pip install connectorx to install it. So only Linux users with an aarch64 processor need to build connectorx from source. We also tried to cross-compile it in our release action (as we did for mac M1), but there are still some issues (#240). The good news is that we have switched to Rust stable now that GATs are stabilized!

duvenagep commented 1 year ago

Hi @wangxiaoying,

I am a big fan of this library and use it a lot for EDA in Jupyter Notebooks with Polars on big datasets (>= 200GB). I now want to use it in a production environment on Airflow! Unfortunately, my whole team uses Mac M1s, and every team member builds a local dev Airflow environment from a docker-compose file in a shared GitHub repo!

I currently use polars but read in the data using pd.read_sql_query and then transform the data to a polars dataframe with pl.from_pandas! I would prefer to use connectorX with Polars!
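The two loading paths described above can be sketched as follows. This is a minimal illustration, not the commenter's actual code: the function names and arguments are hypothetical, and the third-party imports are done lazily inside each function so the module loads even where connectorx cannot be installed (the situation this issue is about).

```python
def load_via_pandas(query, connection):
    """Current workflow: read with pandas, then convert to a Polars DataFrame."""
    # Lazy imports: pandas and polars are only needed when this path is used.
    import pandas as pd
    import polars as pl

    pandas_frame = pd.read_sql_query(query, connection)
    return pl.from_pandas(pandas_frame)


def load_via_connectorx(query, connection_uri):
    """Preferred workflow: let connectorx return a Polars DataFrame directly."""
    # Lazy import: connectorx has no prebuilt Linux aarch64 wheel in this thread.
    import connectorx as cx

    # connection_uri is a hypothetical database URI string supplied by the caller.
    return cx.read_sql(connection_uri, query, return_type="polars")
```

The second function skips the intermediate pandas DataFrame, which is the step the comment above wants to avoid.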

Below is the error log from trying to install ConnectorX. Currently, building from source for the whole team is not an option.

Is there any update on aarch64 support?

 Downloading polars-0.16.5-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (13.3 MB)
#0 11.85      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 14.2 MB/s eta 0:00:00
#0 11.89 ERROR: Ignored the following versions that require a different python version: 1.0.0 Requires-Python >=3.6, <3.9; 1.0.1 Requires-Python >=3.6, <3.9; 1.0.2 Requires-Python >=3.6, <3.9; 1.0.3 Requires-Python >=3.6, <3.9; 1.0.4 Requires-Python >=3.6, <3.9; 1.1.0 Requires-Python >=3.6, <3.9; 1.1.1 Requires-Python >=3.6, <3.9; 1.1.2 Requires-Python >=3.6, <3.9; 1.10.0 Requires-Python >=3.6, <3.9; 1.10.1 Requires-Python >=3.6, <3.9; 1.2.0 Requires-Python >=3.6, <3.9; 1.3.0 Requires-Python >=3.6, <3.9; 1.4.0 Requires-Python >=3.6, <3.9; 1.5.0 Requires-Python >=3.6, <3.9; 1.6.0 Requires-Python >=3.6, <3.9; 1.6.1 Requires-Python >=3.6, <3.9; 1.6.2 Requires-Python >=3.6, <3.9; 1.6.3 Requires-Python >=3.6, <3.9; 1.7.0 Requires-Python >=3.6, <3.9; 1.8.0 Requires-Python >=3.6, <3.9; 1.8.1 Requires-Python >=3.6, <3.9; 1.9.0 Requires-Python >=3.6, <3.9; 1.9.1 Requires-Python >=3.6, <3.9; 1.9.2 Requires-Python >=3.6, <3.9; 1.9.3 Requires-Python >=3.6, <3.9; 1.9.4 Requires-Python >=3.6, <3.9; 1.9.5 Requires-Python >=3.6, <3.9; 1.9.6 Requires-Python >=3.6, <3.9; 2.0.0 Requires-Python >=3.6, <3.9; 2.0.1 Requires-Python >=3.6, <3.9; 2.1.0 Requires-Python >=3.6, <3.9; 2.2.0 Requires-Python >=3.6, <3.9; 2.3.0 Requires-Python >=3.6, <3.9
#0 11.89 ERROR: Could not find a version that satisfies the requirement connectorx==0.3.1 (from versions: 0.2.3)
#0 11.89 ERROR: No matching distribution found for connectorx==0.3.1
#0 11.89 
#0 11.89 [notice] A new release of pip available: 22.3.1 -> 23.0
#0 11.89 [notice] To update, run: python -m pip install --upgrade pip
------
failed to solve: executor failed running [/bin/bash -o pipefail -o errexit -o nounset -o nolog -c pip install -r ./requirements.txt]: exit code: 1

ghost commented 1 year ago

I am also interested in whether this is on the roadmap. Linux ARM64 support is very important for some of the work we are doing, and it would be very nice if it were officially supported.

devinrsmith commented 1 year ago

"For mac M1 we do have precompiled wheel file support. You can just run pip install connectorx to install it."

This works great for local mac M1 development purposes, but when using Docker on M1 it does not work well since Docker containers on M1 are (ideally) Linux aarch64/arm64. (You can have Docker on M1 emulate x86_64 / amd64, but it's painful...)
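To see which case a given environment falls into, a small check like the one below can be run on the host and again inside the container. It is only a sketch based on what this thread reports (prebuilt wheels exist for Mac M1, none for Linux aarch64); the function name is hypothetical.

```python
import platform


def needs_source_build(system: str, machine: str) -> bool:
    """Return True for Linux on 64-bit ARM, the combination this thread
    reports as having no prebuilt connectorx wheel."""
    return system == "Linux" and machine.lower() in {"aarch64", "arm64"}


if __name__ == "__main__":
    system, machine = platform.system(), platform.machine()
    print(f"{system}/{machine}: source build needed = "
          f"{needs_source_build(system, machine)}")
```

On an M1 host, `platform.system()` reports `Darwin`, while inside a native Linux container on the same machine it reports `Linux` with an ARM machine type, which is why `pip install connectorx` behaves differently in the two places.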

dat-adi commented 1 year ago

Hi, my team has been trying to get a connector-x Python wheel for aarch64 on an EC2 instance, and we've succeeded in building v0.3.1 by following the build instructions. I'm pasting the Dockerfile we used to build the wheel here, to save others the effort of setting everything up from scratch.

# Retrieving the base image from manylinux 2014 because that's what connectorx
# 0.3.1 was built on.
FROM quay.io/pypa/manylinux2014_aarch64

# Installing devel dependencies
RUN yum install -y epel-release
RUN yum install -y mysql-devel freetds-devel postgresql-devel

# Creating and changing to a new directory
RUN mkdir /wheeler
WORKDIR /wheeler

# Installing and setting up rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="$PATH:/root/.cargo/bin"

# Installing just through cargo
#RUN "$HOME/.cargo/bin/cargo" install just
RUN cargo install just

# Installing python3.9.6 from source
RUN yum install -y wget
RUN wget https://www.python.org/ftp/python/3.9.6/Python-3.9.6.tgz
RUN tar -xvf Python-3.9.6.tgz
RUN cd Python-3.9.6 && ./configure --enable-optimizations
RUN cd Python-3.9.6 && make install
RUN pip3.9 install poetry

# Cloning the connectorx repo and switching to the 0.3.1 tag
RUN git clone https://github.com/sfu-db/connector-x.git
WORKDIR /wheeler/connector-x
RUN git checkout tags/v0.3.1

# Installing the nightly version of rust upon which connectorx was built
RUN cargo install nightly-2022-09-15
RUN cargo override set nightly-2022-09-15

# Installing maturin
RUN pip3.9 install maturin==0.12.1

# Building the python wheel through maturin
RUN maturin build -m connectorx-python/Cargo.toml -i python3.9 --no-sdist --release --manylinux 2014

# Copying the wheel into the host system
COPY /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.1-*.whl .

# The wheel should be on your system at this point.

Hope this helps someone out with the build. Please let me know if I can help out with sending a PR or something :D

jordibeen commented 1 year ago

Lovely stuff @dat-adi, thanks for sharing!

Note that if you're looking for "pip install connectorx" in an arm-based Dockerfile, you can use adi's Dockerfile as a build stage in a multi-stage build to accomplish this.

When naming the stage connectorx-builder as such:

FROM quay.io/pypa/manylinux2014_aarch64 as connectorx-builder

You will be able to copy and pip install connectorx in another stage of your Dockerfile:

COPY --from=connectorx-builder /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.1-*.whl ./
RUN pip install connectorx-0.3.1-*.whl

(Small side note: I had to install and configure the correct nightly version of rust the following way, as opposed to the cargo commands mentioned above)

RUN rustup install nightly-2022-09-15
RUN rustup override set nightly-2022-09-15

tomroh commented 1 year ago

When referring to these docs here, I'm trying to find the right Rust version, as they say to "search for rust". Everything above references nightly builds, but is it now just the "stable" version of Rust?

duvenagep commented 1 year ago

Yes, I just recently built from source using the method described above. Only now you don't need to specify the nightly anymore; stable works.

So you can remove these lines:

RUN rustup install nightly-2022-09-15
RUN rustup override set nightly-2022-09-15

vnijs commented 11 months ago

@duvenagep and @dat-adi, thanks for putting out this information. I tried building wheels using the Docker specs you provided, but I'm running into the issue below. Any suggestions? Do you perhaps have an updated script for version 0.3.2?

> [18/19] RUN maturin build -m connectorx-python/Cargo.toml -i python3.9 --no-sdist --release --manylinux 2014:
365.8 
366.5 error: aborting due to previous error
366.5 
366.5 
366.5 For more information about this error, try `rustc --explain E0554`.
366.5 
366.5 error: could not compile `connectorx-python` (lib) due to 2 previous errors
366.5 💥 maturin failed
366.5   Caused by: Failed to build a native library through cargo
366.5   Caused by: Cargo build finished with "exit status: 101": `cargo rustc --message-format json --manifest-path connectorx-python/Cargo.toml --release --lib --`
------
Dockerfile:42
--------------------
  40 |     
  41 |     # Building the python wheel through maturin
  42 | >>> RUN maturin build -m connectorx-python/Cargo.toml -i python3.9 --no-sdist --release --manylinux 2014
  43 |     
  44 |     # Copying the wheel into the host system
--------------------
ERROR: failed to solve: process "/bin/sh -c maturin build -m connectorx-python/Cargo.toml -i python3.9 --no-sdist --release --manylinux 2014" did not complete successfully: exit code: 1

dat-adi commented 11 months ago

Hi @vnijs, when I tried to compile the binary a few weeks ago, it was working fine. Could you check whether adding the nightly flags fixes the issue?

It seemed to be necessary on my end, and did not work as expected without it. Also, check whether you can retrieve a logfile or a trace.

vnijs commented 11 months ago

When I add the nightly lines, I get:

#21 [17/21] RUN cargo install nightly-2022-09-15
#21 0.073     Updating crates.io index
#21 0.305 error: could not find `nightly-2022-09-15` in registry `crates-io` with version `*`
#21 ERROR: process "/bin/sh -c cargo install nightly-2022-09-15" did not complete successfully: exit code: 101
------
 > [17/21] RUN cargo install nightly-2022-09-15:
0.073     Updating crates.io index
0.305 error: could not find `nightly-2022-09-15` in registry `crates-io` with version `*`
------

So this is fully the script you initially provided:

# Installing the nightly version of rust upon which connectorx was built
RUN cargo install nightly-2022-09-15
RUN cargo override set nightly-2022-09-15

I don't use rust myself, just python, so I'm not sure how to get additional log/trace information here

vnijs commented 11 months ago

Update: Replacing the cargo commands with rustup gets me a bit further (see below for what I added and what I commented out).

Unfortunately, I'm not sure where the wheel is now, because I get the error below. If you have any suggestions, please let me know. Thanks

#25 [21/21] COPY /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.1-*.whl .
#25 ERROR: lstat /tmp/buildkit-mount562135541/wheeler/connector-x/connectorx-python/target/wheels: no such file or directory
------
 > [21/21] COPY /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.1-*.whl .:
------
Dockerfile:48
--------------------
  46 |     
  47 |     # Copying the wheel into the host system
  48 | >>> COPY /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.1-*.whl .
  49 |     
  50 |     # The wheel should be on your system at this point.
--------------------
ERROR: failed to solve: lstat /tmp/buildkit-mount562135541/wheeler/connector-x/connectorx-python/target/wheels: no such file or directory
-----------------------------------------------------------------------
Docker build for connectorx was not successful
-----------------------------------------------------------------------
usage: sleep seconds

RUN rustup install nightly-2022-09-15
RUN rustup default nightly-2022-09-15
# RUN cargo install nightly-2022-09-15
# RUN cargo override set nightly-2022-09-15

vnijs commented 11 months ago

I was able to extract the wheel but when I tried to install it I got the below. I had hoped this would work with Ubuntu 22.04 (aarch64). Any help on next steps would be much appreciated.

❯ pip install connectorx-0.3.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
ERROR: connectorx-0.3.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl is not a supported wheel on this platform.
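pip reports "not a supported wheel on this platform" when none of the tags in the wheel's filename match the running interpreter and platform. The filename here carries the `cp39` tag, meaning CPython 3.9, so one possible explanation (an assumption; the thread does not confirm it) is that the target environment runs a different Python version, since Ubuntu 22.04's default `python3` is 3.10. A minimal stdlib-only sketch for checking this follows; the helper names are hypothetical, and it deliberately ignores generic tags such as `py3` or `abi3`.

```python
import sys


def parse_wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its name, version, and tag lists.

    Layout: name-version[-build]-python-abi-platform.whl, where each tag
    may be a dot-separated set of alternatives.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platform": platform_tag.split("."),
    }


def cpython_tag(version_info=sys.version_info) -> str:
    """Build the CPython tag for an interpreter version, e.g. cp310."""
    return f"cp{version_info[0]}{version_info[1]}"


def python_tag_matches(filename: str, version_info=sys.version_info) -> bool:
    """Check only the python tag; ABI and platform tags are not compared."""
    return cpython_tag(version_info) in parse_wheel_tags(filename)["python"]
```

Calling `python_tag_matches` with the wheel filename above inside the target environment shows whether the interpreter version is the mismatch. Running `pip debug --verbose` there lists the full set of tags pip accepts.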

Below is the platform I was hoping to install this wheel on:

❯ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
❯ uname -m
aarch64

vnijs commented 11 months ago

Got this working with connectorx 0.3.2, Ubuntu 22.04, and Python 3.11.7. See the code for the Docker image below.

FROM arm64v8/ubuntu:22.04

# Installing devel dependencies
RUN apt-get update
RUN apt-get install -y \
    libmysqlclient-dev \
    freetds-dev \
    libpq-dev \
    wget \
    curl \
    build-essential \
    libkrb5-dev \
    clang \
    git

# Creating and changing to a new directory
RUN mkdir /wheeler
WORKDIR /wheeler

# Installing and setting up rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="$PATH:/root/.cargo/bin"

# Installing just through cargo
RUN cargo install just

# Installing python3.11.7 from source
RUN wget https://www.python.org/ftp/python/3.11.7/Python-3.11.7.tgz
RUN tar -xvf Python-3.11.7.tgz
RUN cd Python-3.11.7 && ./configure --enable-optimizations
RUN cd Python-3.11.7 && make install
RUN pip3.11 install poetry

# Cloning the connectorx repo and switching to the 0.3.2 tag
RUN git clone https://github.com/sfu-db/connector-x.git
WORKDIR /wheeler/connector-x
RUN git checkout tags/v0.3.2

# Installing maturin
RUN pip3.11 install maturin[patchelf]==0.14.15

# Building the python wheel through maturin
RUN maturin build -m connectorx-python/Cargo.toml -i python3.11 --release

# Copying the wheel into the host system
# the below didn't work for me
# COPY /wheeler/connector-x/connectorx-python/target/wheels/connectorx-* .

# use the below to access the wheel in /wheeler/connector-x/connectorx-python/target/wheels/ 
# docker run -it -v ./:/root your_user_name/connectorx /bin/bash

# then navigate to the directory below and copy the wheel to the home directory
# which is mounted to your current directory on your host OS
# /wheeler/connector-x/connectorx-python/target/wheels
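The commented-out COPY fails because COPY reads from the build context on the host, not from the filesystem of the image being built. As an alternative to the interactive `docker run` with a bind mount, the wheel can be pulled out of the built image with `docker create` and `docker cp`. The sketch below wraps those commands in Python; the image name, container name, and destination directory are placeholders, and the function names are hypothetical.

```python
import os
import subprocess

# Path inside the image where maturin writes the wheel, per the Dockerfile above.
WHEEL_DIR = "/wheeler/connector-x/connectorx-python/target/wheels"


def extraction_commands(image: str, container_name: str, dest: str) -> list:
    """Build the docker commands that copy the wheels directory to the host."""
    return [
        # Create a stopped container from the image, without running it.
        ["docker", "create", "--name", container_name, image],
        # The trailing "/." copies the directory's contents into dest.
        ["docker", "cp", f"{container_name}:{WHEEL_DIR}/.", dest],
        # Remove the temporary container.
        ["docker", "rm", container_name],
    ]


def extract_wheels(image: str, container_name: str = "connectorx-wheel-tmp",
                   dest: str = "./wheels") -> None:
    """Run the commands, removing the container even if the copy fails."""
    os.makedirs(dest, exist_ok=True)
    create, copy, remove = extraction_commands(image, container_name, dest)
    subprocess.run(create, check=True)
    try:
        subprocess.run(copy, check=True)
    finally:
        subprocess.run(remove, check=True)
```

For example, `extract_wheels("your_user_name/connectorx")` would leave the built wheel in `./wheels` on the host, assuming the image was built and tagged under that name.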

jessy1092 commented 2 weeks ago

Thanks to everyone’s help, I successfully built connectorx v0.3.3 from source on the python:3.10-slim image on an M1 Mac. I’d like to share the Dockerfile below.

FROM python:3.10-slim AS builder

RUN apt-get update
RUN apt-get install -y curl

RUN mkdir /wheeler
WORKDIR /wheeler

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="$PATH:/root/.cargo/bin"

RUN rustup install 1.78.0
RUN rustup override set 1.78.0

RUN apt-get install -y git

RUN git clone https://github.com/sfu-db/connector-x.git
WORKDIR /wheeler/connector-x
RUN git checkout tags/v0.3.3

RUN pip install maturin[patchelf]==0.14.15

# Install the dependencies
RUN apt-get install -y clang build-essential libkrb5-dev

RUN maturin build -m connectorx-python/Cargo.toml -i python3.10 --release

FROM builder AS base

COPY --from=builder /wheeler/connector-x/connectorx-python/target/wheels/connectorx-0.3.3-*.whl ./
RUN pip install connectorx-0.3.3-*.whl