ojedaf / vCLIMB_Benchmark


Question about the number of GPUs #8

Closed. YeungChiu closed this issue 9 months ago

YeungChiu commented 9 months ago

Hello. I ran into a strange error, shown below:

Traceback (most recent call last):
  File "main_icarl_conLoss.py", line 269, in <module>
    main()
  File "main_icarl_conLoss.py", line 174, in main
    train_loop(model, optimizer, train_cilDatasetList, val_cilDatasetList, test_cilDatasetList)
  File "main_icarl_conLoss.py", line 219, in train_loop
    model.add_samples_to_mem(val_cilDatasetList, data, m)
  File "/root/data/vCLIMB_Benchmark/model/iCaRL_conLoss.py", line 127, in add_samples_to_mem
    feature = self.feature_encoder(video, get_emb = True).data.cpu().numpy()
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
TypeError: Caught TypeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/root/miniconda3/envs/vclimb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'input'

As far as I can tell from my investigation, this error seems to be related to the number of GPUs. I couldn't find any information about the number of GPUs in the paper or the other materials, so I would like to know whether you used a single GPU or multiple GPUs for your experiments.

Looking forward to your reply.

ojedaf commented 9 months ago

Hi @YeungChiu,

We appreciate your interest in our work. The experiments were carried out on a single GPU (V100).

Best, Andrés
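
For anyone reproducing the single-GPU setup on a multi-GPU machine, a minimal sketch follows. The traceback shows the feature encoder wrapped in torch.nn.DataParallel and the failure raised in replica 1, so exposing only one device to CUDA should keep the run off the multi-replica path. The helper name restrict_to_single_gpu and the choice of device 0 are assumptions for illustration and are not part of the vCLIMB code. CUDA_VISIBLE_DEVICES is the standard CUDA environment variable, and it has to be set before CUDA is initialized, so in practice before torch is imported.

```python
import os


def restrict_to_single_gpu(device_index: int = 0) -> str:
    """Expose only one physical GPU to CUDA libraries loaded afterwards.

    Hypothetical helper, not part of the vCLIMB repository.
    Returns the value written to CUDA_VISIBLE_DEVICES.
    """
    value = str(device_index)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this at the very top of the entry script, before `import torch`.
# Setting the variable after CUDA has been initialized has no effect.
restrict_to_single_gpu(0)

# From here on, a later `import torch` would see exactly one device,
# and nn.DataParallel with default device_ids would use only that device.
```

The same effect can be had without touching the code by prefixing the launch command in the shell, for example CUDA_VISIBLE_DEVICES=0 python main_icarl_conLoss.py followed by whatever arguments the script needs.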