facebookresearch / minihack

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
Apache License 2.0

[BUG] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/nlew7mo9hzl' #88

Open skezle opened 1 year ago

skezle commented 1 year ago

🐛 Bug

We intermittently get a FileNotFoundError for the NetHack temporary directory after running our RL algorithm for more than 1M steps. The error occurs on only some runs, not all of them. We use the following environment wrapper over MiniHack: https://github.com/AGI-Labs/continual_rl/blob/develop/continual_rl/experiments/tasks/make_minihack_task.py.

To Reproduce

Steps to reproduce the behavior:

We run our RL algorithm on MiniHack environments (RoomRandom-15x15 or RoomTrap-15x15). Occasionally, after around 1M steps, we get the following FileNotFoundError when the NetHack tmp directory is looked up:

[Screenshot: Screen Shot 2023-06-04 at 11 14 29 PM]

Expected behavior

No FileNotFoundError when running our RL algorithm on MiniHack.

Environment

I'm running this on a SLURM scheduler so didn't collect the GPU info.

Collecting environment information...
MiniHack version: 0.1.3
NLE version: 0.8.1
Gym version: 0.23.1
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A

OS: CentOS Linux release 8.1.1911 (Core)
GCC version: (conda-forge gcc 10.3.0-16) 10.3.0
CMake version: version 3.23.2

Python version: 3.8
Is CUDA available: N/A
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.19.5
[conda] Could not collect

Additional context

-

JupiLogy commented 1 year ago

This looks like an NLE problem, since the temporary directory is created there. Alternatively, or in addition, could your system be deleting temp files automatically after a certain amount of time?
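
If automatic cleanup of /tmp is the cause, one thing to try is pointing Python's temp location at a job-local directory before any environment is created. This is a minimal sketch, not a confirmed fix. It assumes NLE creates its directory through Python's `tempfile` module, which the `/tmp/nle...` path suggests but this thread does not confirm. The `redirect_tempdir` helper and the `nle_tmp` folder name are hypothetical, and `SLURM_JOB_ID` is only used to keep jobs apart when it is set.

```python
import os
import tempfile


def redirect_tempdir(base_dir):
    """Make Python's tempfile module create new temp dirs under base_dir."""
    os.makedirs(base_dir, exist_ok=True)
    os.environ["TMPDIR"] = base_dir   # also inherited by child processes
    tempfile.tempdir = None           # clear the cached value so TMPDIR is re-read
    return tempfile.gettempdir()


# Hypothetical job-local location outside /tmp; adjust to your cluster.
job_id = os.environ.get("SLURM_JOB_ID", "local")
job_tmp = os.path.join(os.getcwd(), "nle_tmp", job_id)

active = redirect_tempdir(job_tmp)

# Probe: a directory created the same way (prefix "nle") now lands under job_tmp.
probe = tempfile.mkdtemp(prefix="nle")
print(active, os.path.dirname(probe) == active)
os.rmdir(probe)  # remove the probe directory again
```

This has to run before the environments (and any actor processes) are created, since directories that already exist stay where they are. Exporting `TMPDIR` in the SLURM job script should have the same effect for code that uses `tempfile`.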

Bpoole908 commented 4 months ago

Was there any solution to this?