tlc-pack / tlcpack

https://tlcpack.ai/
Apache License 2.0

`tlcpack/package-cpu` uses a different GCC version than `tlcpack/package-cu*`, leading to compatibility issues #142

Closed PhilippvK closed 1 year ago

PhilippvK commented 1 year ago

While fixing the Linux nightly builds, I ran into another issue (in addition to #138 and #141, which I was able to fix):

While all the CUDA builds now complete successfully, the CPU-only build is still broken due to the following error:

auditwheel: error: cannot repair "dist/tlcpack_nightly-0.10.0rc0.dev44+g1311cac88-cp37-cp37m-linux_x86_64.whl" to "manylinux2014_x86_64" ABI because of the presence of too-recent versioned symbols. You'll need to compile the wheel on an older toolchain.

I tracked the issue down to the base image used in the Dockerfile. While the CUDA builds use pytorch/manylinux-cuda*, the CPU image uses quay.io/pypa/manylinux2014_x86_64:2022-02-13-594988e.

Apart from CUDA support, these base images differ in one significant way: the PyTorch variant uses GCC 9.3, while the newer official manylinux images ship with GCC 10.2 (since August 2021). With GCC 9, the aforementioned error does not occur.
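To illustrate what "too-recent versioned symbols" means, here is a minimal Python sketch of the kind of check involved. It is not auditwheel's actual implementation. The limits are the maximum symbol versions listed for manylinux2014 in PEP 599; the function name `too_recent` and the sample inputs are hypothetical, and in practice the version tags would come from inspecting the shared libraries inside the wheel.

```python
import re

# Maximum symbol versions allowed by the manylinux2014 policy (PEP 599).
MANYLINUX2014_LIMITS = {
    "GLIBC": (2, 17),
    "GLIBCXX": (3, 4, 19),
    "CXXABI": (1, 3, 7),
    "GCC": (4, 8, 0),
}

# Matches tags such as "GLIBCXX_3.4.19"; other tag shapes are ignored.
_VERSION_TAG = re.compile(r"^([A-Z]+)_(\d+(?:\.\d+)*)$")


def too_recent(symbol_versions):
    """Return the version tags that exceed the manylinux2014 limits.

    Hypothetical helper for illustration only.
    """
    offenders = []
    for tag in symbol_versions:
        match = _VERSION_TAG.match(tag)
        if match is None:
            continue  # non-numeric tags (e.g. CXXABI_TM_1) are not checked here
        family = match.group(1)
        version = tuple(int(part) for part in match.group(2).split("."))
        limit = MANYLINUX2014_LIMITS.get(family)
        if limit is not None and version > limit:
            offenders.append(tag)
    return offenders
```

For example, given the made-up input `["GLIBC_2.17", "GLIBCXX_3.4.28"]`, only the second tag is reported, because it is newer than the `GLIBCXX_3.4.19` ceiling. A wheel that references such a symbol cannot be tagged `manylinux2014_x86_64`.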

We have a few possibilities to fix this:

PhilippvK commented 1 year ago

@leandron @driazati What do you think?

driazati commented 1 year ago

the pytorch/manylinux-cpu switch sounds reasonable and nicely aligns with the other images. it seems like we could also install gcc-9 ourselves and use that to compile TVM so there are no symbol version issues, but just changing the base image to match the others is a lot simpler

PhilippvK commented 1 year ago

@driazati I forgot that you ship an AArch64 image as well. I am not sure what the status of that one is, and I did not test it.

As PyTorch does not ship AArch64 versions of their Docker images (see https://github.com/pytorch/pytorch/issues/59437), my proposed solution is not applicable to tlcpack/package-cpu_aarch64.