Open wspspring opened 1 year ago
The training data looks like this:

    [
      {
        "image": "63.jpg",
        "caption": "Soup Can Tomato Colored, 1968 by Andy Warhol Art Print Offset Lithograph 24x36",
        "image_id": 1,
        "kwords": [ "cover art", "museum", "poster", "artwork", "gallery", "cover image", "1960s", "movie poster", "canvas print", "art print", "home decor", "image collections", "art", "opening", "picture" ]
      },
    ]
When training on my own data, the following error occurs in epoch 0:
Traceback (most recent call last):
File "Retrieval.py", line 713, in <module>
main(args, config)
File "Retrieval.py", line 607, in main
train_stats = train(model, train_loader, optimizer, tokenizer, epoch, warmup_steps, device, lr_scheduler, config)
File "Retrieval.py", line 68, in train
loss_ita, loss_itm = model(image, text_input,alpha=alpha, idx=idx)
File "/root/miniconda3/envs/vicha/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/vicha/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/root/miniconda3/envs/vicha/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "ViCHA/models/model_retrieval_kw_img.py", line 240, in forward
neg_idx = torch.multinomial(weights_t2i[b], 1).item()
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
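
The message means that the row passed to `torch.multinomial` contained at least one entry that is `inf`, `nan`, or negative. As a minimal sketch of that check, written in plain Python rather than on tensors, the helper below lists the offending positions in one row of weights. The name `invalid_entries` and the sample row are hypothetical and not part of ViCHA; in the training loop one could presumably feed it `weights_t2i[b].tolist()` just before the sampling call to see which entries are bad.

```python
import math


def invalid_entries(weights):
    """Return (index, value) pairs that are nan, inf, or negative.

    These are the three conditions named in the RuntimeError above.
    `weights` is a plain list of floats (hypothetical helper, not ViCHA code).
    """
    bad = []
    for i, w in enumerate(weights):
        if math.isnan(w) or math.isinf(w) or w < 0:
            bad.append((i, w))
    return bad


# Made-up row for illustration: positions 1, 3 and 4 are invalid.
row = [0.2, float("nan"), 0.5, -0.1, float("inf")]
problems = invalid_entries(row)
print(problems)
```

If the list is non-empty in the very first epoch, the values feeding the similarity scores (and therefore the loss) are probably already non-finite at that point, which narrows the search to the inputs or the forward pass rather than the sampling step itself.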