jax-ml / jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
http://jax.readthedocs.io/
Apache License 2.0

GLIBC issue #854

Closed zqwei closed 5 years ago

zqwei commented 5 years ago

I am running CentOS 7, whose default glibc is 2.17. I wonder if it is possible to rebuild the pip package against that version. The error when importing jax in IPython is as follows.

ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found (required by lib/python3.7/site-packages/jaxlib/xla_extension.so)
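
The message names the symbol version the wheel needs (GLIBC_2.23), which can be compared with the glibc the interpreter itself runs against. A minimal Python sketch, assuming a Linux system where `platform.libc_ver()` can identify glibc; the helper names `required_glibc` and `parse_version` are illustrative and not part of jaxlib:

```python
import platform
import re


def required_glibc(error_message):
    """Return the GLIBC version named in a loader error as a tuple, or None."""
    match = re.search(r"GLIBC_(\d+)\.(\d+)", error_message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_version(text):
    """Turn a version string such as '2.17' into the tuple (2, 17)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


message = (
    "ImportError: /lib64/libm.so.6: version `GLIBC_2.23' not found "
    "(required by lib/python3.7/site-packages/jaxlib/xla_extension.so)"
)
needed = required_glibc(message)

# platform.libc_ver() describes the C library of the running interpreter;
# it returns empty strings when the library cannot be determined.
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    have = parse_version(version)
    print(f"system glibc {have}, wheel needs {needed}, ok={have >= needed}")
else:
    print(f"could not detect glibc; wheel needs {needed}")
```

On a stock CentOS 7 machine this comparison would be expected to fail, since 2.17 is older than 2.23.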

hawkinsp commented 5 years ago

Thanks for filing the issue!

I'm afraid I don't think we can build our pip packages for such an old distribution. CentOS 7 was released in 2014 (5 years ago!)

Since JAX depends on XLA, which is distributed as part of TensorFlow, we're (informally) following the same OS support policy as TensorFlow itself: https://www.tensorflow.org/install#install-tensorflow

In particular, we build our pip wheels on Ubuntu 16.04 and don't really have the engineering time to support other distributions ourselves, especially ones as old as that.

That said, I know of no particular reason why it should not work, and you should most likely be able to build jaxlib yourself for CentOS 7; follow the instructions here: https://github.com/google/jax#building-jax-from-source . If you encounter any problems, we welcome PRs!

Hope that helps!

ericmjl commented 4 years ago

@hawkinsp sorry to rehash a closed issue, but I tried this thing where I installed linuxbrew in my home directory to get an updated glibc. (I don't have sudo permissions, so this is the only hack I know of to get an updated glibc and other CLI tools.)

I know there's an updated glibc by two mechanisms: (1) CLI tools that previously required GLIBC 2.23 and errored out now no longer error out, and (2) typing ldd --version gives me ldd 2.23.

However, importing JAX still gives me the same /lib64/libm.so.6 issue. Is it because there's some path that is hard-coded?
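
The path in the error (/lib64/libm.so.6) suggests the interpreter is still resolving libm from the system location rather than from the linuxbrew prefix. One way to check, sketched here under the assumption of a Linux system with /proc available, is to list which libc and libm files the running Python process has mapped. The helper name `loaded_libraries` is illustrative:

```python
import os


def loaded_libraries(maps_text, names=("libc", "libm")):
    """Return sorted paths of mapped files whose basename matches one of `names`."""
    found = set()
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        base = os.path.basename(path)
        # Match libm.so.6 and libm-2.17.so, but not libcrypto.so and the like.
        if any(base.startswith((name + ".", name + "-")) for name in names):
            found.add(path)
    return sorted(found)


if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as handle:
        for path in loaded_libraries(handle.read()):
            print(path)
```

If the printed paths point at /lib64 rather than the linuxbrew directory, the Python binary in use is still bound to the system glibc, regardless of what `ldd --version` reports in the shell.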

jpellman commented 4 years ago

> I'm afraid I don't think we can build our pip packages for such an old distribution. CentOS 7 was released in 2014 (5 years ago!)

This claim is only sort of true. While CentOS 7 was initially released in 2014, it has been continuously updated since then with minor releases, with the last major release in 9/2019 and the full update EOL in 8/2020. While it's understandable that you don't want to deal with the glacial pace of developer toolchain updates in the RHEL world (a deliberate decision made for stability, much to the chagrin of every developer everywhere), there are still some of us out there who operate in such environments. As you indicated in your last comment, this is an open source package and as such anyone is free to compile it themselves. I don't expect y'all at Google to suddenly start supporting RHEL family distros if you don't want to. However, I think that there is a very real need/desire amongst the JAX user base for RHEL family support that should be acknowledged and not hand-waved aside with claims of "but it's old".

Disclaimer: I'm intending this comment as constructive criticism, and I hope it's received as such rather than as an attack, because overall I think y'all have done good work on JAX and deserve kudos for maintaining it as an open source project.

hawkinsp commented 4 years ago

I think this issue should now be resolved. JAX builds manylinux2010-compliant CPU wheels, which should be compatible with CentOS 6.

(The way we do this is actually by using a more modern toolchain, necessary for building the C++ parts of JAX and cross-compiling for a CentOS 6 era library chain.)
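
For reference, each manylinux tag fixes a glibc baseline: manylinux1 targets glibc 2.5, manylinux2010 targets 2.12, and manylinux2014 targets 2.17. A rough sketch of the resulting compatibility check follows; the table name `MANYLINUX_GLIBC` and the function `wheel_runs_on` are illustrative, and the check ignores other requirements such as CPU architecture or libstdc++:

```python
# Minimum glibc version assumed by each manylinux platform tag.
MANYLINUX_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def wheel_runs_on(tag, system_glibc):
    """Return True if a system with `system_glibc` meets the tag's glibc baseline."""
    return system_glibc >= MANYLINUX_GLIBC[tag]


# CentOS 7 ships glibc 2.17, which is newer than the manylinux2010 baseline.
print(wheel_runs_on("manylinux2010", (2, 17)))  # True
```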

hawkinsp commented 4 years ago

That said, we don't actually test them on CentOS, so let us know if they work!

hawkinsp commented 4 years ago

I should also add: from the point of view of JAX, our support levels are at least somewhat based on what TensorFlow supports. JAX depends heavily on XLA, which is distributed as part of TensorFlow. The JAX team is quite small and we don't have resources ourselves to target things older than TensorFlow's code base can target. We're not opposed to it, but we only have finite time and many things we would like to make progress on (e.g., making an amazing machine learning toolkit!) Contributions are always welcome.

mattjj commented 4 years ago

To frame Peter's last comment another way: if you get TF / XLA to support building on a platform, we'll likely inherit that for free. The best place to push on that might be the TF issue tracker.

MilesCranmer commented 4 years ago

Also seeing this issue on my institute cluster (not a big deal, I'll try building from source) so +1 for CentOS 7. I can get TF/XLA to work from pip fine with GPUs.

I think many academic supercomputers use RHEL variants (e.g., DOE's Summit & Sierra) due to their stability so it might be something to consider adding support for. As @jpellman mentions, CentOS 8 was only released ~five months ago and I don't think it's well adopted yet. CentOS 7 is supported with maintenance through 2024.

MilesCranmer commented 4 years ago

I couldn't manage to get it to build, but the conda-forge version seems to work for CPU-only stuff.

RylanSchaeffer commented 3 years ago

Is there any update on getting JAX to work with CentOS? I work on a shared machine and can't change the OS.

MilesCranmer commented 3 years ago

@RylanSchaeffer try this: https://github.com/google/jax/issues/2083#issuecomment-578578765