mrm8488 / shared_colab_notebooks

A Repo to store the Google Colaboratory Notebooks that I have created and shared

RuntimeError when training T5 on WikiSQL in the Colab notebook: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64] #11

Open · eshehadi opened this issue 1 year ago

eshehadi commented 1 year ago

I am running the Colab notebook shared here:

https://github.com/mrm8488/shared_colab_notebooks/blob/bf6d578042bbb393e8cfcb336e2909c9f460b91c/T5_wikiSQL_multitask_with_HF_transformers.ipynb

When I get to trainer.evaluate(), I get the following error message:

RuntimeError: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64]

I've searched for solutions, but I can't find many instances of this type of error in NLP training; it seems to occur most often with raster image data.

I would greatly appreciate any insight that you may have. Thanks!

Eric

dharma610 commented 1 year ago

@mrm8488 I faced the same issue; could you please help us out here?

eshehadi commented 1 year ago

> @mrm8488 I faced the same issue; could you please help us out here?

I think it has to do with the transformers version. I tried running the code on my local machine as opposed to Colab and had to downgrade transformers to an earlier version.

asksonu commented 1 year ago

@eshehadi, which transformers version worked for you? I tried 4.30.2, 4.29.0, and 4.30.0 (in Google Colab) and got the same error with all of them.

asksonu commented 1 year ago

I figured out that the problem was with the padding and truncation of the input and output tokens in the convert_to_features function. The error disappeared after I replaced

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
```

with

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
```
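To sanity-check the difference, here is a minimal sketch (t5-small is an assumption here, chosen only because it is small to download; it may not be the checkpoint the notebook uses). With `truncation=True` and `padding="max_length"`, every encoded sequence comes out exactly `max_length` tokens long, so the batch tensors keep a consistent shape:

```python
from transformers import AutoTokenizer

# t5-small is used purely as an example; any T5 tokenizer
# should show the same padding/truncation behavior.
tokenizer = AutoTokenizer.from_pretrained("t5-small")

texts = ["translate English to SQL: how many singers are there?"]

# truncation=True caps sequences at max_length, and padding="max_length"
# pads shorter ones up to it, so every row is exactly 64 tokens.
enc = tokenizer.batch_encode_plus(
    texts, truncation=True, padding="max_length", max_length=64)

print(len(enc["input_ids"][0]))       # 64
print(len(enc["attention_mask"][0]))  # 64
```

My understanding is that pad_to_max_length has been deprecated for a while and the defaults around truncation changed across transformers releases, so the old call may no longer produce uniformly sized sequences, which would explain the shape mismatch.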

calam1 commented 1 year ago

> @eshehadi, which transformers version worked for you? I tried 4.30.2, 4.29.0, and 4.30.0 (in Google Colab) and got the same error with all of them.

I changed the transformers version to 4.26.0 to get past the shape error.
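For reference, one way to pin that version in a Colab cell (the runtime usually needs a restart afterwards so the downgraded package is the one that actually gets imported):

```python
# Run this in its own cell, then restart the Colab runtime.
!pip install transformers==4.26.0
```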

calam1 commented 1 year ago

> I figured out that the problem was with the padding and truncation of the input and output tokens in the convert_to_features function. The error disappeared after I replaced
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
> ```
>
> with
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
> ```

Unfortunately for me, while running transformers 4.30.2 (the latest at the time), making the padding change did not resolve the problem. I had to downgrade transformers to 4.26.0 (the intermediate versions may also work; I did not try them).
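If you take the downgrade route, a quick sanity check that the runtime actually picked up the pinned version:

```python
import transformers

# After reinstalling and restarting the runtime, this should print 4.26.0.
print(transformers.__version__)
```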