Has anyone else encountered this error after running the training command?

Star-down commented 5 months ago

2024-04-23 10:00:57,809 agent number of parameters: 4346693
Traceback (most recent call last):
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 95, in main
    trainer.train()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 267, in train
    rollouts.observations[sensor][0].copy_(batch[sensor])
RuntimeError: The size of tensor a (5) must match the size of tensor b (128) at non-singleton dimension 1
Exception ignored in: <function VectorEnv.__del__ at 0x7a436e41e670>
Traceback (most recent call last):
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 592, in __del__
    self.close()
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 463, in close
    write_fn((CLOSE_COMMAND, None))
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/home/dwl/ss/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 62, in send
    self.send_bytes(buf.getvalue())
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
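The RuntimeError in this traceback is raised by an in-place copy whose source and destination shapes disagree (5 vs. 128 at dimension 1). As a rough diagnostic sketch, not a fix: the helper below lists every sensor whose buffer shape differs from the batch shape. `find_shape_mismatches`, `FakeTensor`, the sensor names, and the example shapes are all hypothetical and not part of sound-spaces. The helper only reads a `.shape` attribute, so it should also accept torch tensors.

```python
from typing import Any, Dict, List, Tuple


class FakeTensor:
    """Stand-in exposing only .shape, so the sketch runs without torch."""

    def __init__(self, *shape: int) -> None:
        self.shape = tuple(shape)


def find_shape_mismatches(
    dst: Dict[str, Any], src: Dict[str, Any]
) -> List[Tuple[str, tuple, tuple]]:
    """Return (sensor, dst_shape, src_shape) for every sensor whose shapes differ."""
    mismatches = []
    for sensor, src_value in src.items():
        dst_shape = tuple(dst[sensor].shape)
        src_shape = tuple(src_value.shape)
        if dst_shape != src_shape:
            mismatches.append((sensor, dst_shape, src_shape))
    return mismatches


# Hypothetical shapes, chosen only to illustrate a 5-vs-128 disagreement at dimension 1.
buffer_slot = {"sensor_a": FakeTensor(1, 128, 128, 1), "sensor_b": FakeTensor(1, 5, 5, 2)}
batch = {"sensor_a": FakeTensor(1, 128, 128, 1), "sensor_b": FakeTensor(1, 128, 5, 2)}

mismatches = find_shape_mismatches(buffer_slot, batch)
for sensor, dst_shape, src_shape in mismatches:
    print(f"{sensor}: buffer expects {dst_shape}, batch provides {src_shape}")
```

Comparing the shapes of `rollouts.observations[sensor][0]` and `batch[sensor]` in this way just before the failing line would show which sensor disagrees.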

swapb94 commented 3 months ago

Were you able to resolve this, @Star-down?

Star-down commented 3 months ago

@swapb94 No. I tried removing the redundant dimensions, but the program then stopped after running for about 20 minutes because of a memory leak.

swapb94 commented 3 months ago

@ChanganVR, any suggestions?

I followed the step-by-step installation guide and checked out both habitat-lab and habitat-sim at v0.1.7. However, running cache_observations gives ImportError: cannot import name 'HabitatSimSensor' from 'habitat.sims.habitat_simulator.habitat_simulator'. A similar issue is already open (https://github.com/facebookresearch/sound-spaces/issues/134).
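One way to confirm whether a given checkout actually exposes the missing name is to test the import directly. This is a minimal sketch: `has_attribute` is a hypothetical helper, and it is demonstrated on the standard library so it runs without habitat installed.

```python
import importlib


def has_attribute(module_name: str, attr_name: str) -> bool:
    """Return True if module_name can be imported and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)


# Demonstrated on the standard library so the sketch runs anywhere.
print(has_attribute("json", "dumps"))             # True
print(has_attribute("json", "HabitatSimSensor"))  # False

# In the affected environment, the equivalent check would be:
# has_attribute("habitat.sims.habitat_simulator.habitat_simulator", "HabitatSimSensor")
```

Running the commented check against each habitat-lab checkout would show which version defines `HabitatSimSensor` before any files are copied between versions.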

To work around this, I copied habitat-lab/habitat/sims/habitat_simulator/habitat_simulator.py from habitat-lab v0.2.2 into habitat-lab v0.1.7. After that, cache_observations.py runs without errors. However, running

python ss_baselines/av_nav/run.py --run-type eval --exp-config ss_baselines/av_nav/config/audionav/replica/test_telephone/audiogoal_depth.yaml EVAL_CKPT_PATH_DIR data/pretrained_weights/audionav/av_nav/replica/heard.pth

gives

2024-06-11 10:43:26,164 Initializing dataset AudioNav
2024-06-11 10:43:26,182 initializing sim SoundSpacesSim
2024-06-11 10:43:26,532 Initializing task AudioNav
Sequential(
  (0): Conv2d(1, 32, kernel_size=(8, 8), stride=(4, 4))
  (1): ReLU(inplace=True)
  (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
  (3): ReLU(inplace=True)
  (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2))
  (5): Flatten()
  (6): Linear(in_features=2304, out_features=512, bias=True)
  (7): ReLU(inplace=True)
)
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 15, 16]           4,128
              ReLU-2           [-1, 32, 15, 16]               0
            Conv2d-3             [-1, 64, 6, 7]          32,832
              ReLU-4             [-1, 64, 6, 7]               0
            Conv2d-5             [-1, 64, 4, 5]          36,928
           Flatten-6                 [-1, 1280]               0
            Linear-7                  [-1, 512]         655,872
              ReLU-8                  [-1, 512]               0

Total params: 729,760
Trainable params: 729,760
Non-trainable params: 0

Input size (MB): 0.03
Forward/backward pass size (MB): 0.19
Params size (MB): 2.78
Estimated Total Size (MB): 3.00

0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 97, in main
    trainer.eval(args.eval_interval, args.prev_ckpt_ind, config.USE_LAST_CKPT)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/common/base_trainer.py", line 105, in eval
    result = self._eval_checkpoint(self.config.EVAL_CKPT_PATH_DIR, writer)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 522, in _eval_checkpoint
    _, actions, _, test_recurrent_hidden_states = self.actor_critic.act(
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 44, in act
    features, rnn_hidden_states = self.net(
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 206, in forward
    x.append(self.visual_encoder(observations))
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/models/visual_cnn.py", line 156, in forward
    depth_observations = depth_observations.permute(0, 3, 1, 2)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 5 is not equal to len(dims) = 4
0%| | 0/1000 [00:00<?, ?it/s]
Exception ignored in: <function VectorEnv.__del__ at 0x7f57e21de790>
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 588, in __del__
    self.close()
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 459, in close
    write_fn((CLOSE_COMMAND, None))
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 63, in send
    self.send_bytes(buf.getvalue())
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
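The permute error says the depth observation arrives with 5 dimensions while the ordering `(0, 3, 1, 2)` names only 4. The sketch below mimics that dimension check on plain shape tuples, so it runs without torch. `permuted_shape` and the example shapes are hypothetical illustrations, not the library's implementation and not the actual observation shapes from this run.

```python
from typing import Sequence, Tuple


def permuted_shape(shape: Sequence[int], order: Sequence[int]) -> Tuple[int, ...]:
    """Mimic the dimension check a permute performs and return the resulting shape."""
    if len(shape) != len(order):
        raise ValueError(
            f"input has {len(shape)} dims but ordering has {len(order)} entries"
        )
    if sorted(order) != list(range(len(shape))):
        raise ValueError(f"{tuple(order)} is not a permutation of the input dimensions")
    return tuple(shape[i] for i in order)


# A 4-D batch laid out as (batch, height, width, channels)
# becomes (batch, channels, height, width).
print(permuted_shape((1, 128, 128, 1), (0, 3, 1, 2)))  # (1, 1, 128, 128)

# Hypothetical 5-D input: the same ordering is rejected, as in the traceback.
try:
    permuted_shape((1, 1, 128, 128, 1), (0, 3, 1, 2))
except ValueError as err:
    print(err)
```

Printing the shape of `depth_observations` just before the failing line in visual_cnn.py would show which extra dimension is present.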

facebookresearch / sound-spaces

Has anyone else encountered this error after running the training command? #141
