facebookresearch / OccupancyAnticipation

This repository contains code for our publication "Occupancy Anticipation for Efficient Exploration and Navigation" in ECCV 2020.
MIT License

RuntimeError: inverse_cuda: For batch 0: U(163642880,163642880) is zero, singular U #43

Closed by narekvslife 2 years ago

narekvslife commented 2 years ago

Hi everyone!

I am running a Docker image on a Google Cloud instance with an NVIDIA K80. I have tried three different base images:

nvidia/cudagl:10.0-devel-ubuntu18.04
nvidia/cudagl:10.1-devel-ubuntu18.04
nvidia/cudagl:10.2-devel-ubuntu18.04

The Dockerfile follows the steps from the README, plus additional steps for downloading the Gibson dataset and additional configs.

nvidia-smi and nvcc work as expected, and I am also setting CUDA_VISIBLE_DEVICES=0.

I have tried different combinations of torch/torch-scatter with python==3.7.11. After running:

/root/miniconda3/envs/env/bin/python -u run.py --exp-config $OCCANT_ROOT_DIR/configs/model_configs/occant_rgb/ppo_navigation_evaluate.yaml --run-type eval

I am either getting: 1) undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

or

2) RuntimeError: inverse_cuda: For batch 0: U(163642880,163642880) is zero, singular U.
Exception ignored in: <function VectorEnv.__del__ at 0x7f1fb3805560>
Traceback (most recent call last):
  File "/OccupancyAnticipation/environments/habitat/habitat-api/habitat/core/vector_env.py", line 534, in __del__
  File "/OccupancyAnticipation/environments/habitat/habitat-api/habitat/core/vector_env.py", line 416, in close
  File "/root/miniconda3/envs/env/lib/python3.7/multiprocessing/connection.py", line 206, in send

I saw issues #34 and #3, so I understand that this might be an incompatibility between torch, torch-scatter, and CUDA. Can someone please suggest a working combination of torch/torch-scatter versions for CUDA 10.2?
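One way to narrow down whether the inverse_cuda error comes from the installed torch build rather than from the model code is to invert a known-good matrix on the CPU and on the GPU and compare the results. The following is only a minimal sketch under that assumption: the function name cuda_inverse_sanity_check is hypothetical and not part of this repository, and a passing check does not prove that the torch-scatter build is compatible as well.

```python
def cuda_inverse_sanity_check(n=4):
    """Invert a trivially invertible matrix on CPU and GPU and compare.

    Hypothetical helper, not part of OccupancyAnticipation.
    Returns "no-torch", "no-cuda", "ok", "mismatch", or "error: <message>".
    """
    try:
        import torch
    except ImportError:
        return "no-torch"
    if not torch.cuda.is_available():
        return "no-cuda"

    # A scaled identity matrix is trivially invertible (its inverse is 0.5 * I).
    a = torch.eye(n) * 2.0
    cpu_inv = torch.inverse(a)
    try:
        gpu_inv = torch.inverse(a.cuda()).cpu()
    except RuntimeError as err:
        # A "singular U" error on this input would suggest a problem with the
        # torch/CUDA installation rather than with the data being inverted.
        return "error: " + str(err)
    return "ok" if torch.allclose(cpu_inv, gpu_inv, atol=1e-5) else "mismatch"


if __name__ == "__main__":
    print(cuda_inverse_sanity_check())
```

If this reports anything other than "ok" on a machine with a working GPU, reinstalling torch with a build that matches the CUDA version of the base image is a reasonable next step.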

narekvslife commented 2 years ago

UPDATE

Uninstalling and reinstalling the packages in various orders eventually made everything work.

Working combination:

nvidia/cudagl:10.1-devel-ubuntu18.04
python==3.7.11
torch==1.4.0
torchvision==0.5.0
torch-scatter==1.3.1
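To confirm that an environment actually matches these pins before launching run.py, a small check like the one below may help. It is a sketch, not part of the repository: the names EXPECTED, installed_version, and check_pins are hypothetical, and it assumes that a local build tag such as "+cu101" in a version string can be ignored when comparing.

```python
# Pins copied from the working combination reported in this comment.
EXPECTED = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "torch-scatter": "1.3.1",
}


def installed_version(name):
    """Return the installed version string of `name`, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # setuptools fallback for Python 3.7

        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        # Assumption: drop a local build tag such as "+cu101" before comparing.
        base = found.split("+")[0] if found else None
        if base != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in check_pins(EXPECTED).items():
        print(f"{name}: expected {want}, found {found or 'not installed'}")
```

Run inside the conda environment, the script prints one line per mismatched package and nothing when all three pins match.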