facebookresearch / agenthive

AgentHive provides the primitives and helpers for a seamless usage of robohive within TorchRL.

TorchRL opens too many files while sampling #10

ShahRutav closed this issue 1 year ago

ShahRutav commented 1 year ago
Traceback (most recent call last):
  File "sac.py", line 418, in main
    sampled_tensordict = replay_buffer.sample(args.batch_size).clone()
  File "/home/rutavms/research/robohive/latest/rl/torchrl/data/replay_buffers/replay_buffers.py", line 427, in sample
    data, info = super().sample(batch_size, return_info=True)
  File "/home/rutavms/research/robohive/latest/rl/torchrl/data/replay_buffers/replay_buffers.py", line 243, in sample
    ret = self._prefetch_queue.popleft().result()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/rutavms/research/robohive/latest/rl/torchrl/data/replay_buffers/replay_buffers.py", line 63, in decorated_fun
    output = fun(self, *args, **kwargs)
  File "/home/rutavms/research/robohive/latest/rl/torchrl/data/replay_buffers/replay_buffers.py", line 218, in _sample
    data = self._collate_fn(data)
  File "/home/rutavms/research/robohive/latest/rl/torchrl/data/replay_buffers/storages.py", line 438, in _collate_contiguous
    return x.to_tensordict()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1190, in to_tensordict
    {
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1193, in <dictcomp>
    else value.to_tensordict()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1190, in to_tensordict
    {
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1193, in <dictcomp>
    else value.to_tensordict()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1190, in to_tensordict
    {
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/tensordict.py", line 1191, in <dictcomp>
    key: value.clone()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/memmap.py", line 368, in clone
    return self._tensor.clone()
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/memmap.py", line 348, in _tensor
    return self._load_item(self._index)
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/memmap.py", line 302, in _load_item
    memmap_array = self.memmap_array
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/tensordict/memmap.py", line 240, in _get_memmap_array
    self._memmap_array = np.memmap(
  File "/home/rutavms/miniconda3/envs/rlhive/lib/python3.8/site-packages/numpy/core/memmap.py", line 267, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OSError: [Errno 24] Too many open files
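
Errno 24 means the process hit its limit on open file descriptors. As a diagnostic aid (not the fix discussed in this thread), the sketch below reads that limit with Python's standard `resource` module and optionally raises the soft limit up to the hard limit. The helper names `open_file_limits` and `raise_soft_limit` are made up for illustration, and the code assumes a Unix-like system; on some platforms the OS may reject large values.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Try to raise the soft limit to `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft  # already unlimited or already high enough
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```

Raising the limit only postpones the error if the number of memory-mapped files keeps growing, so it is better treated as a way to confirm the diagnosis than as a cure.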

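This error can be avoided by restricting which env_info keys are saved in the replay buffer. As the traceback shows, sampling opens a memory-mapped file (np.memmap) for each stored entry it reads, so saving fewer keys reduces the number of files open at once.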
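
A minimal sketch of that workaround, assuming each transition can be viewed as a nested mapping before it is added to the buffer. The helper `select_keys` and the key names (`solved`, `rwd_dense`, `obs_dict`) are hypothetical and only illustrate the idea; with an actual TensorDict, its own `select` or `exclude` methods are likely the natural equivalent.

```python
def select_keys(data, keep):
    """Return a copy of the nested mapping `data` limited to the key paths in `keep`.

    `keep` is an iterable of tuples, each naming a path of keys from the root.
    Paths that do not exist in `data` are skipped silently.
    """
    out = {}
    for path in keep:
        # Walk down the source mapping to find the value at this path.
        node = data
        found = True
        for key in path:
            if not isinstance(node, dict) or key not in node:
                found = False
                break
            node = node[key]
        if not found:
            continue
        # Rebuild the same path in the output and attach the value.
        dst = out
        for key in path[:-1]:
            dst = dst.setdefault(key, {})
        dst[path[-1]] = node
    return out


# Hypothetical transition with a large env_info sub-dictionary.
transition = {
    "observation": [0.1, 0.2],
    "action": [0.0],
    "reward": 1.0,
    "env_info": {
        "solved": False,
        "rwd_dense": 0.5,
        "obs_dict": {"qpos": [0.0, 0.0], "qvel": [0.0, 0.0]},
    },
}

# Keep only what training needs, plus a single env_info entry.
keep = [("observation",), ("action",), ("reward",), ("env_info", "solved")]
slim = select_keys(transition, keep)
```

Here `slim` retains `observation`, `action`, `reward`, and `env_info["solved"]`; the remaining env_info entries are never written to the buffer, so no files are created or opened for them.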

ShahRutav commented 1 year ago

https://github.com/pytorch/rl/commit/9003a56bcec4da4ddc126551a7cc311af3ec9fc7