Closed corcra closed 8 years ago
There is no workaround to needing a newer version of glibc except docker. We cannot upgrade the system glibc. If they truly do not support the version that comes with 6.X you can consider a docker attempt. If you need to be added to the docker group please advise.
It seems like this glibc dependency is unavoidable, so I will have to go with docker. Can you add me to the docker group? Thanks!
You have been added to the group on the head node; log out and back in for it to take effect. It will take a little while to propagate to the nodes.
Be aware that the migration to docker-capable kernels is still in progress. Until that work is done, if you submit jobs with qsub, be sure to request the docker attribute. The long story is in #360, but basically I am working on making all nodes docker-capable again since the repo change, and only the nodes with that attribute are complete. If this doesn't make sense, I'm happy to elaborate.
A brief example perhaps of value in making sure you get a docker capable node:
hal> qsub -I -l nodes=1:docker -q active
gpu-1-6$ docker run ubuntu lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
gpu-1-6$ exit
Obviously more to it than that but wanted to show the attribute selection.
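For a non-interactive job, the same attribute selection could be sketched as a batch script. This is only a sketch under assumptions: the `#PBS` directives mirror the `-l nodes=1:docker -q active` options from the interactive example, the `have_docker` helper is a hypothetical name added for illustration, and the `--rm` flag (which removes the container on exit) was not part of the original command.

```shell
#!/bin/sh
# Request one node carrying the docker attribute, on the active queue
# (same resource selection as the interactive qsub example).
#PBS -l nodes=1:docker
#PBS -q active

# Hypothetical helper: succeeds only when a docker binary is on PATH.
have_docker() {
    command -v docker >/dev/null 2>&1
}

if have_docker; then
    # Same container command as in the interactive session.
    docker run --rm ubuntu lsb_release -a
else
    echo "docker not found on $(uname -n); was the docker attribute requested?" >&2
fi
```

Saved under a name of your choosing (say `docker-test.sh`, an assumed filename), it would be submitted with `qsub docker-test.sh`.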
Sounds good! I'll take care with qsub.
+1 on being able to support tensorflow natively. Would be good to figure out what the CentOS upgrade plan is from @juanperin at some point.
+1 for eventually leaving CentOS 6, though I'll leave it to @juanperin and @jchodera and @tatarsky to figure out the when.
Any success with using Docker to run TensorFlow?
There are some stackoverflow threads, namely the one that @corcra linked to, that do suggest hack arounds. My understanding is it involves installing a local glibc of the correct version by extracting from the Ubuntu package. Seems terrible though.
Am I correct that there are no plans to upgrade our glibc version? AFAIU we are running version 2.12 (according to ldd --version), but the latest is 2.22, leaving us at least 5 years out of date.
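For anyone who wants to check a given node, here is a minimal sketch that compares the installed glibc against the 2.17 threshold discussed in this thread. The `version_ge` helper is a hypothetical name, it only handles major.minor versions, and it assumes `getconf GNU_LIBC_VERSION` is available (it is on glibc-based systems).

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A >= B.
# Only the major and minor components are compared.
version_ge() {
    a_major=${1%%.*}; a_minor=${1#*.}; a_minor=${a_minor%%.*}
    b_major=${2%%.*}; b_minor=${2#*.}; b_minor=${b_minor%%.*}
    [ "$a_major" -gt "$b_major" ] && return 0
    [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]
}

required="2.17"   # threshold mentioned in this thread for TensorFlow

# getconf prints e.g. "glibc 2.12"; keep only the version field.
installed=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')

if [ -z "$installed" ]; then
    echo "could not determine the glibc version on this host"
elif version_ge "$installed" "$required"; then
    echo "glibc $installed is new enough (need >= $required)"
else
    echo "glibc $installed is too old (need >= $required); docker is the fallback"
fi
```

On a node with glibc 2.12, this should take the "too old" branch.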
There are no plans to my knowledge to update the base distro (and thus glibc) at this time.
You are welcome to speak to @juanperin on the topic but there is no clean or supported method to update the base glibc without the distro.
Note that CentOS 7 glibc is 2.17.
Docker and its ilk are glorified chroot environments to hack around the matter.
Yes, I guess that is what Docker and others are for. Hmm. Interesting.
Yes, I've gotten something to run by simply following the TensorFlow instructions. They have a handy Docker build section: https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#docker-installation.
Have not tested how this works with GPUs.
Thanks for the help.
> There are no plans to my knowledge to update the base distro (and thus glibc) at this time.
There isn't a plan to do this during the three days of cluster downtime?
I'll ask @juanperin about what his plans are here.
+1 for the CentOS update. We have talked about this for months and should get started ASAP. I can volunteer the cpath nodes. In practice, docker adds significant overhead and is a security risk. CNTK, for example, like TensorFlow, demands a newer glibc. Best, Thomas
I believe @juanperin intends to do this as part of the move of new cluster nodes inside the firewall, but I'm not sure a timeframe has been decided for this yet.
I'm trying to install TensorFlow (https://www.tensorflow.org/versions/0.6.0/get_started/os_setup.html) and it apparently requires glibc >= 2.17 (https://github.com/tensorflow/tensorflow/issues/53#issuecomment-156575907). Is it possible to get a newer version of glibc (I'm seeing version 2.12), or is there some workaround?