Closed shreyaskar123 closed 11 months ago
Hey! I would suggest you try to isolate the bug, as we have a limited timeframe to debug custom code. If this is indeed a bug we can help you; otherwise the community forum is a good place to ask this!
@ArthurZucker: I believe this is a bug, because most of the code in `__getitem__` is from the provided example. Could you please look into this? I believe it has something to do with the git-base-vatex processor. Specifically, inside `__getitem__`, `pixel_values` has shape `torch.Size([1, 6, 3, 224, 224])` (rank 5), and then `torch_default_data_collator` increases the rank to 6 via `batch[k] = torch.stack([f[k] for f in features])`, causing the error. I tried to work around this by squeezing the first dimension in `__getitem__`, making the tensor `torch.Size([6, 3, 224, 224])`, but then for some reason inside `_call_impl` in `module.py`, `pixel_values` isn't even part of `kwargs` when doing the `forward_call`, causing an error. I get the exact same error when trying to squeeze the extra dimension inside `torch_default_data_collator` in `data_collator.py` via the following code:
```python
for k, v in first.items():
    if k not in ("label", "label_ids") and v is not None and not isinstance(v, str):
        if isinstance(v, torch.Tensor):
            if k == "pixel_values" and v.shape[0] == 1:  # Add this condition
                batch[k] = torch.stack([f[k].squeeze(0) for f in features])
            else:
                batch[k] = torch.stack([f[k] for f in features])
```
Any help would be greatly appreciated. Thanks!
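(For context, the rank increase described above is easy to reproduce in isolation. The shapes below mirror the report: a minimal sketch, not the reporter's actual collator.)

```python
import torch

# Each __getitem__ call returns pixel_values with a leading batch dim of 1,
# e.g. shape (1, 6, 3, 224, 224) for 6 frames of a 224x224 video.
features = [{"pixel_values": torch.zeros(1, 6, 3, 224, 224)} for _ in range(2)]

# The default collator stacks the per-example tensors, which prepends
# another dimension on top of the one the processor already added.
stacked = torch.stack([f["pixel_values"] for f in features])
print(stacked.shape)  # torch.Size([2, 1, 6, 3, 224, 224]) -> rank 6

# Squeezing the leftover dim of 1 before stacking keeps the result at rank 5,
# which is what GIT's forward accepts for video input.
fixed = torch.stack([f["pixel_values"].squeeze(0) for f in features])
print(fixed.shape)    # torch.Size([2, 6, 3, 224, 224]) -> rank 5
```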
Hi @shreyaskar123 this is not a bug on our side, it's a bug on the data preparation side. You can fix it by removing the batch dimension which the processor creates by default.
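(A minimal sketch of what removing the batch dimension in the dataset could look like. The class and attribute names here are illustrative assumptions, not the reporter's actual code; only the `squeeze`-in-`__getitem__` idea comes from the thread.)

```python
import torch
from torch.utils.data import Dataset


class VideoCaptionDataset(Dataset):
    """Hypothetical dataset wrapping a GIT processor."""

    def __init__(self, processor, videos, captions):
        self.processor = processor
        self.videos = videos        # each entry: a list of video frames
        self.captions = captions

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, idx):
        encoding = self.processor(
            images=self.videos[idx],
            text=self.captions[idx],
            return_tensors="pt",
        )
        # The processor returns batched tensors (leading dim of 1); drop it
        # here so the collator's torch.stack re-adds exactly one batch dim.
        return {k: v.squeeze(0) for k, v in encoding.items()}
```

With this, `pixel_values` per example is rank 5 (`(num_frames, 3, 224, 224)` becomes rank 4 for images, rank 5 only after batching), and the default collator produces the rank-5 batch the model expects.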
@NielsRogge: I did try to remove the batch dimension (see https://github.com/huggingface/transformers/issues/26230#issuecomment-1724807694), but I get an error that `pixel_values` isn't part of `kwargs` anymore. Could you please take a look?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info

`transformers` version: 4.30.2

Who can help?

@NielsRogge

Information

Tasks

`examples` folder (such as GLUE/SQuAD, ...)

Reproduction
At this point, when I print the dimension it is 5 (as expected). But when I print the dimension of `pixel_values` in the first line of `forward` in `modeling_git.py`, the dimension is 6. Because of this I get the error: `raise ValueError("pixel_values must be of rank 4 or 5") ValueError: pixel_values must be of rank 4 or 5`
This is the full stack trace for reference:
Expected behavior
Ideally the dimension of `pixel_values` inside `forward` would also be 5, and the fine-tuning of git-base-vatex on video would work. This is a blocking issue, and any help would be really appreciated!
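(As an alternative to patching `data_collator.py` directly, a custom collate function can be passed to the `Trainer` via its `data_collator` argument, or to a `DataLoader` via `collate_fn`. This is a sketch under the assumption that only `pixel_values` carries the extra leading dim of 1; the function name is hypothetical.)

```python
import torch


def video_collate_fn(features):
    # Hypothetical collator (not from the issue): stack per-example tensors,
    # first squeezing the batch dim of 1 that the processor adds to
    # pixel_values, so the stacked batch stays at rank 5.
    batch = {}
    for k in features[0]:
        tensors = [f[k] for f in features]
        if k == "pixel_values" and tensors[0].dim() == 5 and tensors[0].shape[0] == 1:
            tensors = [t.squeeze(0) for t in tensors]
        batch[k] = torch.stack(tensors)
    return batch


# Usage, e.g.: Trainer(..., data_collator=video_collate_fn)
# or: DataLoader(dataset, batch_size=2, collate_fn=video_collate_fn)
```

This keeps the library's default collator untouched and confines the fix to the training script.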