Open hegdekartik opened 1 year ago
For the first error, can you provide more error logs? For the second error, you were missing the argument --teacher_path. The entire command, as mentioned in README.md, is:
CUDA_VISIBLE_DEVICES=0 python assign_answer.py --dataset [cpv2/v2] --name number --split high --teacher_path []
For the second error, --teacher_path was an optional argument, so we placed model.pth at the location given in assign_answer.py, which is './logs/lmh_css/model.pth'.
Could you please provide the correct link to the right model.pth for this step?
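To make the question concrete, here is a minimal sketch of how an optional --teacher_path with a fallback location could be resolved and checked before loading. It is an assumption about the general pattern, not the actual code in assign_answer.py: the default path is the one quoted above, and `resolve_teacher_path` and `check_checkpoint` are hypothetical helper names.

```python
import argparse
import os

# Assumption: default taken from the path quoted in this thread,
# not verified against assign_answer.py.
DEFAULT_TEACHER_PATH = "./logs/lmh_css/model.pth"


def resolve_teacher_path(argv=None) -> str:
    """Return --teacher_path if given, otherwise the default location."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--teacher_path", default=DEFAULT_TEACHER_PATH)
    # parse_known_args ignores the script's other flags in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    return args.teacher_path


def check_checkpoint(path: str) -> str:
    """Fail early with a readable message if the checkpoint file is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"teacher checkpoint not found: {path}")
    return path
```

Calling `check_checkpoint(resolve_teacher_path())` before loading would at least separate "file is in the wrong place" from "file is the wrong checkpoint".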
Error logs for the first error:
Building train dataset...
caching-features: 100%|████████████████████████████████████| 443757/443757 [38:56<00:00, 189.96it/s]
tokenize: 100%|█████████████████████████████████████████| 443757/443757 [00:03<00:00, 119740.31it/s]
tensorize: 100%|████████████████████████████████████████| 443757/443757 [00:04<00:00, 106497.75it/s]
Building test dataset...
caching-features: 100%|████████████████████████████████████| 214354/214354 [18:59<00:00, 188.16it/s]
tokenize: 100%|██████████████████████████████████████████| 214354/214354 [00:04<00:00, 48356.19it/s]
tensorize: 100%|████████████████████████████████████████| 214354/214354 [00:01<00:00, 109298.11it/s]
Starting training...
Epoch 1: 0%| | 0/867 [00:00<?, ?it/s]/home/kartik/.conda/envs/BLIP_env/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Epoch 1: 0%| | 0/867 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 178, in <module>
main()
File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 175, in main
train(model, train_loader, eval_loader, args,qid2type)
File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/train.py", line 280, in train
visual_grad = torch.autograd.grad((pred * (a > 0).float()).sum(), v, create_graph=True)[0]
File "/home/kartik/.conda/envs/BLIP_env/lib/python3.10/site-packages/torch/autograd/__init__.py", line 300, in grad
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 2048]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
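For context, a minimal sketch of how this class of RuntimeError arises in PyTorch. It is not the KDDAug code: the layer sizes and the function name `grad_wrt_input` are made up for illustration, and whether train.py hits this exact pattern cannot be told from the log alone. ReLU keeps its output for the backward pass, so editing that output in place makes the later `torch.autograd.grad` call fail.

```python
import torch
import torch.nn as nn


def grad_wrt_input(inplace: bool) -> torch.Tensor:
    """Gradient of a scalar prediction w.r.t. the input of a tiny ReLU stack."""
    torch.manual_seed(0)
    v = torch.randn(4, 8, requires_grad=True)
    h = torch.relu(nn.Linear(8, 8)(v))  # ReLU saves its output for backward
    if inplace:
        h += 1.0       # in-place edit bumps the version of the saved tensor
    else:
        h = h + 1.0    # out-of-place: new tensor, saved output left untouched
    pred = torch.sigmoid(h).sum()
    return torch.autograd.grad(pred, v, create_graph=True)[0]


if __name__ == "__main__":
    print(grad_wrt_input(inplace=False).shape)
    try:
        grad_wrt_input(inplace=True)
    except RuntimeError as err:
        print("RuntimeError:", str(err)[:90])
```

As the hint in the error message says, calling `torch.autograd.set_detect_anomaly(True)` before training makes PyTorch also report the forward operation whose saved tensor was modified, which should help locate the in-place edit.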
Hi,
I am still having this issue. Can you please check and help me resolve this issue? Thank you.
Hi,
Thanks for the great work. I found it interesting and wanted to try it out, but we are getting errors in the 'KD-based Answer Assignment' step.
We get the following error when we run the following command:
So we tried the other approach given, which is to use a pretrained teacher model (CSS) downloaded from CSS-VQA. Unfortunately, after downloading 'model.pth' and running the 'Assign new answer' command, we got the error below.
How can I get rid of this error?
Thank you