mees / calvin

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
http://calvin.cs.uni-freiburg.de
MIT License

Illegal instruction when trying to evaluate pre-trained MCIL model with dataset D #70

Closed AtteNyyssonen closed 8 months ago

AtteNyyssonen commented 9 months ago

Hello,

I've been trying to get CALVIN evaluation to work using dataset D and the pre-trained MCIL model. I am working on a virtual machine running Ubuntu 22.04.3.

I have followed the instructions for setting up the conda env and the downloads, but running the following command doesn't work:

python ../calvin_models/calvin_agent/evaluation/evaluate_policy.py --dataset_path ./task_D_D/ --train_folder ./D_D_static_rgb_baseline/ --checkpoint ./D_D_static_rgb_baseline/mcil_baseline.ckpt

Output: Illegal instruction (core dumped)

I ran the commands from the directory calvin/dataset/, where I placed the pre-trained model and dataset D.

Did I miss something in the instructions? I can't figure out what to do next.

lukashermann commented 9 months ago

Hi @AtteNyyssonen, is that the full output that was printed to the command line (i.e., was it the first line that was printed)? If not, could you provide us with the complete output log?

AtteNyyssonen commented 9 months ago

Hi @lukashermann,

Yes, that is the only thing printed to the command line. Here is the complete output:

(base) atte@atte-VirtualBox: ~ /calvin$ conda activate calvin_venv
(calvin_venv) atte@atte-VirtualBox: ~ /calvin$ cd dataset/
(calvin_venv) atte@atte-VirtualBox:~/calvin/dataset$ python ../calvin_models/calvin_agent/evaluation/evaluate_policy.py --dataset_path ./task_D_D/ --train_folder ./D_D_static_rgb_baseline/ --checkpoint ./D_D_static_rgb_baseline/mcil_baseline.ckpt
Illegal instruction (core dumped)

(I added spaces around the first two ~ characters because GitHub interpreted them as strikethrough markup.)

AtteNyyssonen commented 9 months ago

Hi @lukashermann,

I figured out after extensive trying and googling that this was caused by Hyper-V still being active due to Windows 11 Memory Integrity VBS.
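One way a hypervisor layer can lead to "Illegal instruction" is by hiding CPU extensions (for example AVX) from the guest, so that prebuilt binaries execute instructions the virtual CPU rejects. Whether that is exactly what happened here is an assumption; the thread only establishes that Hyper-V was active. The sketch below checks which extensions a Linux guest reports in /proc/cpuinfo. The helper names and the default list of flags (`avx`, `avx2`, `fma`) are hypothetical choices for illustration, not requirements stated by CALVIN.

```python
from pathlib import Path
from typing import Iterable, List, Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()  # no 'flags' line found (e.g. non-x86 or non-Linux system)


def missing_flags(cpuinfo_text: str,
                  wanted: Iterable[str] = ("avx", "avx2", "fma")) -> List[str]:
    """List the wanted flags (an assumed set) that the CPU does not report."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("Missing CPU flags:", missing_flags(cpuinfo.read_text()))
```

Running this once inside the VM and once on the host would show whether the guest sees fewer extensions than the physical CPU provides.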

Now I've hit another problem with finding the correct EGL device. Output of running the evaluation command stated above:

Couldn't find correct EGL device. Setting EGL_VISIBLE_DEVICE=0. When using DDP with many GPUs this can lead to OOM errors. Did you install PyBullet correctly? Please refer to calvin env README
argv[0]=--width=200
argv[1]=--height=200
EGL device choice: 0 of 2 (from EGL_VISIBLE_DEVICES)
Loaded EGL 1.4 after reload.
Unable to create EGL context (eglError: 12292)

I looked at some of the other issues and found the one where you asked for the output of this:

cd calvin_env/egl_check
bash build.sh # should have been built automatically, but try running this again
python list_egl_options.py

Here is the output:

----------Default-------------
Starting EGL query
Loaded EGL 1.4 after reload.
b'EGL device choice: -1 of 2.\nUnable to create EGL context (eglError: 12297)\n'
number of EGL devices: 2
----------Option #1 (id=0)-------------
Starting EGL query
EGL device choice: 0 of 2 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.4 after reload.
Unable to create EGL context (eglError: 12297)

----------Option #2 (id=1)-------------
Starting EGL query
EGL device choice: 1 of 2 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=Mesa
GL_RENDERER=llvmpipe (LLVM 15.0.7, 256 bits)
GL_VERSION=4.5 (Core Profile) Mesa 23.0.4-0ubuntu1~22.04.1
GL_SHADING_LANGUAGE_VERSION=4.50
Completeing EGL query

Another issue mentioned that older PyBullet versions could be the cause. My calvin_venv currently has pybullet 3.2.6.

What could be the cause of this EGL issue?
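The numeric eglError values in the logs above are the standard EGL error codes printed in decimal: 12292 is 0x3004 (EGL_BAD_ATTRIBUTE) and 12297 is 0x3009 (EGL_BAD_MATCH). As a small aid for reading such logs, here is a sketch that extracts and names them. The function name `decode_egl_errors` is hypothetical, and the regular expression assumes the "eglError: N" format seen in the output above.

```python
import re
from typing import List, Tuple

# Error codes as defined in the EGL specification (egl.h).
EGL_ERROR_NAMES = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def decode_egl_errors(log: str) -> List[Tuple[int, str]]:
    """Find every 'eglError: N' in a log and pair it with its symbolic name."""
    codes = [int(match) for match in re.findall(r"eglError:\s*(\d+)", log)]
    return [(code, EGL_ERROR_NAMES.get(code, "UNKNOWN")) for code in codes]


if __name__ == "__main__":
    print(decode_egl_errors("Unable to create EGL context (eglError: 12292)"))
```

The names only say which EGL call argument was rejected; they do not by themselves identify the underlying driver or GPU problem.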

lukashermann commented 9 months ago

Hi @AtteNyyssonen, which GPU do you have? We have only tested the code on machines with Nvidia GPUs. In case you do have an Nvidia GPU, maybe you need to reinstall the drivers.

AtteNyyssonen commented 8 months ago

Yes, that was the issue. The VM can't access my Nvidia GPU and was using a virtualized one which caused this error. I will close this issue as the problems have been solved by switching to WSL.
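For anyone hitting the same symptom, a quick check of whether the environment can see an Nvidia GPU at all may save time. The sketch below is illustrative and not part of CALVIN: `list_nvidia_gpus` shells out to `nvidia-smi -L` if that tool is on the PATH, and `is_software_renderer` flags a GL_RENDERER string such as the `llvmpipe` one reported for EGL option #2 above, which is Mesa's CPU-based renderer. Both function names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def list_nvidia_gpus() -> List[str]:
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run([exe, "-L"], capture_output=True,
                                text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # tool present but cannot talk to a driver/GPU
    return [line.strip() for line in result.stdout.splitlines()
            if line.startswith("GPU ")]


def is_software_renderer(gl_renderer: str) -> bool:
    """True if the GL_RENDERER string names a Mesa CPU rasterizer."""
    name = gl_renderer.lower()
    return "llvmpipe" in name or "softpipe" in name


if __name__ == "__main__":
    print("Nvidia GPUs:", list_nvidia_gpus() or "none visible")
    print("Software renderer:",
          is_software_renderer("llvmpipe (LLVM 15.0.7, 256 bits)"))
```

An empty GPU list combined with a software renderer in the EGL query matches the situation described in this thread, where the VM had no access to the physical GPU.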