ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
https://arxiv.org/abs/2104.08860
MIT License

Error AttributeError: 'list' object has no attribute 'shape' when training on one gpu #37

Closed zara0m closed 3 years ago

zara0m commented 3 years ago

Hello,

I am getting this error when trying to train the model on Google Colab with one GPU. It occurs as soon as the first epoch finishes, when evaluation starts. Please let me know how to get around this error.

Here is the stack trace:

11/05/2021 10:51:52 - INFO - Epoch: 1/5, Step: 5600/5625, Lr: 0.000000091, Loss: 0.097879, Time/step: 13.179280
11/05/2021 10:57:31 - INFO - Epoch 1/5 Finished, Train Loss: 0.497687
Traceback (most recent call last):
  File "main_task_retrieval.py", line 564, in <module>
    main()
  File "main_task_retrieval.py", line 548, in main
    R1 = eval_epoch(args, model, test_dataloader, device, n_gpu)
  File "main_task_retrieval.py", line 437, in eval_epoch
    logger.info("sim matrix size: {}, {}".format(sim_matrix.shape[0], sim_matrix.shape[1]))
AttributeError: 'list' object has no attribute 'shape'
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/usr/local/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/local/bin/python', '-u', 'main_task_retrieval.py', '--local_rank=0', '--do_train', '--num_thread_reader=0', '--epochs=5', '--batch_size=32', '--n_display=50', '--train_csv', 'msrvtt_datasplit/MSRVTT_train.9k.csv', '--val_csv', 'msrvtt_datasplit/MSRVTT_JSFUSION_test.csv', '--data_path', 'msrvtt_datasplit/MSRVTT_data.json', '--features_path', '/content/drive/MyDrive/Compressed', '--output_dir', 'ckpts/ckpt_msrvtt_retrieval_looseType', '--lr', '1e-4', '--max_words', '32', '--max_frames', '12', '--batch_size_val', '4', '--datatype', 'msrvtt', '--expand_msrvtt_sentences', '--feature_framerate', '1', '--coef_lr', '1e-3', '--freeze_layer_num', '0', '--slice_framepos', '2', '--loose_type', '--linear_patch', '2d', '--sim_header', 'meanP', '--pretrained_clip_name', 'ViT-B/32']' returned non-zero exit status 1.

zara0m commented 3 years ago

I think the following line of code is missing from this part of eval_epoch. Evaluation works once I add it:

sim_matrix = np.concatenate(tuple(sim_matrix), axis=0)

Is it correct?

ArrowLuo commented 3 years ago

Hi @zara0m, good job, thanks. It is a bug that shows up when running the else branch. There we need to convert the list to a NumPy array, as you did:

sim_matrix = np.concatenate(tuple(sim_matrix), axis=0)
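For readers hitting the same error, here is a minimal sketch of why the one-line fix works. It assumes the single-GPU branch leaves sim_matrix as a Python list of per-batch 2-D arrays that all share the same number of columns. The helper name merge_sim_blocks and the batch shapes are hypothetical and only for illustration; they are not taken from the repository.

```python
import numpy as np


def merge_sim_blocks(blocks):
    """Stack per-batch similarity blocks (hypothetical helper) into one 2-D array.

    Assumes every block has shape (batch_rows, num_columns) with the same
    num_columns, so the blocks can be joined along axis 0.
    """
    return np.concatenate(tuple(blocks), axis=0)


# Hypothetical stand-in for the per-batch results: three blocks of 4, 4 and 2 rows.
rng = np.random.default_rng(0)
sim_matrix = [rng.standard_normal((rows, 10)) for rows in (4, 4, 2)]

# A plain list has no .shape attribute, which is what triggers the AttributeError.
assert not hasattr(sim_matrix, "shape")

# After concatenation the logging call from the traceback works as intended.
sim_matrix = merge_sim_blocks(sim_matrix)
print("sim matrix size: {}, {}".format(sim_matrix.shape[0], sim_matrix.shape[1]))
```

Concatenating along axis 0 keeps each row's position, so row i of the merged array still corresponds to the i-th item across the original batches.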

zara0m commented 3 years ago

Thank you.