rail-berkeley / softlearning

Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
https://sites.google.com/view/sac-and-applications

GLIBC version requirements: /lib64/libm.so.6: version `GLIBC_2.23' not found #95

Open weijiafeng opened 5 years ago

weijiafeng commented 5 years ago

Hi Kristian, @hartikainen

When creating the softlearning environment from the latest master branch, I hit an error saying I need to upgrade my machine's GLIBC from version 2.17 to 2.23:

image

However, I am running this on a GPU cluster (CentOS), and upgrading the C library is non-trivial (e.g. upgrading may affect other users, and we would need to write the script in Lua, which we don't have much knowledge of). I tried installing the softlearning environment using an older version of requirements.txt whose TensorFlow doesn't require GLIBC 2.23, but that produced multiple package-inconsistency errors. So I wanted to pick your brain: is there a way to successfully install the environment and run the reward-learning-rl code without requiring GLIBC 2.23?

Another option I tried is using Docker on my GPU cluster. However, when we build the Docker container on the cluster, it downloads packages from the official Ubuntu website, which is blocked in China :-( We tried using a mirror website, but my MuJoCo license is hard-locked to a particular machine, so the code could not run on a mirror server. Do you know a way to get around this problem?

By the way, a robosuite & mujoco-py version inconsistency error pops up when installing the envs from the latest master version - not sure if this is an issue?

Thanks for looking into this Kristian :-)

image

Docker creation error: image

hartikainen commented 5 years ago

> However, I am running this on a GPU cluster (CentOS), and upgrading the C library is non-trivial (e.g. upgrading may affect other users, and we would need to write the script in Lua, which we don't have much knowledge of).

That's reasonable. It is the main reason I keep the Dockerfile in this repo: managing dependencies on shared machines can be a pain without Docker.

> is there a way to successfully install the environment and run the reward-learning-rl code without requiring GLIBC 2.23?

I'm not sure. It sounds like glibc is quite an integral part of the installation, and it might be difficult to get around that. One thing that comes to mind: maybe you could install glibc through conda? If that doesn't work, then I guess your best chance is to get the Docker installation working.
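For anyone who wants to confirm which glibc a machine provides before attempting the install, here is a minimal sketch using only the Python standard library. The 2.23 threshold is taken from the error message in this issue; the helper names `parse_version` and `glibc_is_sufficient` are hypothetical and not part of softlearning.

```python
import platform

# Minimum version taken from the error message in this issue
# ("version `GLIBC_2.23' not found"); adjust if your error differs.
REQUIRED_GLIBC = (2, 23)


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_sufficient(found, required=REQUIRED_GLIBC):
    """Return True when the found glibc version meets the requirement."""
    return parse_version(found) >= required


if __name__ == "__main__":
    # platform.libc_ver() inspects the running Python executable and
    # returns a (lib, version) pair; both are empty strings if unknown.
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("Could not detect glibc on this system")
    else:
        status = "OK" if glibc_is_sufficient(version) else "too old"
        print(f"glibc {version}: {status} (need >= 2.23)")
```

This only reports what the system offers; it does not work around the requirement.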

> Another option I tried is using Docker on my GPU cluster. However, when we build the Docker container on the cluster, it downloads packages from the official Ubuntu website, which is blocked in China :-(

Unfortunately, I don't know how to get around this.

> We tried using a mirror website, but my MuJoCo license is hard-locked to a particular machine, so the code could not run on a mirror server. Do you know a way to get around this problem?

I don't know how to fix this either. It really sucks that MuJoCo is closed source and doesn't allow you to run things in Docker. Unfortunately, I can't really do anything about it.

> By the way, a robosuite & mujoco-py version inconsistency error pops up when installing the envs from the latest master version - not sure if this is an issue?

Yeah, I'm aware of that. I was testing robosuite out and was hoping that they would upgrade to MuJoCo 2.0 (there's a PR open here: https://github.com/StanfordVL/robosuite/pull/27). Things seem to work fine even though pip warns about the version mismatches. Maybe we should point the robosuite requirement to our custom branch with MuJoCo 2.0 until their master gets updated.

weijiafeng commented 5 years ago

Thanks for this detailed reply, Kristian :-) I am trying to install the requirements as per your latest commit, except that I downgraded tensorflow-gpu to 1.12.0, used mujoco-py<2.1,>=2.0 rather than the version from your git repo (because it requires TensorFlow 1.14.0rc), and changed the versions of a few other packages to be compatible with tensorflow-gpu 1.12.0.

Now I am facing the "local modules not found" issue, which I've described in the other posts.

Thanks!

weijiafeng commented 5 years ago

Btw, since GLIBC requires pretty heavy compilation and admin rights on clusters, I don't think I can get it properly installed with conda. So for now I think I will wait until CentOS 8 comes out, which officially supports GLIBC 2.23 and beyond.

FabianJimenezEsparza commented 4 years ago

Try upgrading the version...