faustomorales / vit-keras

Keras implementation of ViT (Vision Transformer)
Apache License 2.0
329 stars 78 forks

Unknown error code running model.fit #17

Closed (Sicily-F closed this issue 3 years ago)

Sicily-F commented 3 years ago

Hi there,

Thank you SO much for this package! I've been trying it out via this Kaggle tutorial:

[https://www.kaggle.com/raufmomin/vision-transformer-vit-fine-tuning?select=test_images]

I'm running it on my own data, with 36 classes, and I keep getting this error:

Resizing position embeddings from 12 to 7
  warnings.warn(

So far, the model also seems to be too big to run on tensorflow-cpu. Any pointers?

ancesstor2 commented 3 years ago

Hi, try decreasing the batch_size. This model really consumes a lot of memory.

faustomorales commented 3 years ago

Resizing position embeddings from 12 to 7 is not an error. Rather, it is a warning to let you know that position embeddings are getting resized because the model weights were saved in an architecture with a different input size. This is okay, and expected when using different input sizes. But if it happens when you don't expect it (i.e., if you thought you were using a version of the model with the same input size as what the weights were trained with), it can point to a problem. You can suppress the warning using warnings.filterwarnings if you would like. The official implementation does this resizing here.
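For anyone who wants to silence just this message, here is a minimal sketch using the standard library's warnings.filterwarnings. The `message` argument is a regular expression matched against the start of the warning text. The function `fake_load_weights` is a hypothetical stand-in for whatever library call emits the warning; it is not part of vit-keras. In real code, the filter would need to be installed before the model weights are loaded.

```python
import warnings


def suppress_resize_warning():
    # Ignore only warnings whose text starts with this phrase.
    # Other warnings are left untouched.
    warnings.filterwarnings(
        "ignore",
        message="Resizing position embeddings",
    )


def fake_load_weights():
    # Hypothetical stand-in for the library call that emits the warning.
    warnings.warn("Resizing position embeddings from 12 to 7", UserWarning)


# Record warnings so we can check which ones still get through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    suppress_resize_warning()
    fake_load_weights()
    warnings.warn("some other warning", UserWarning)

remaining = [str(w.message) for w in caught]
print(remaining)
```

Filtering on the message text rather than ignoring every UserWarning keeps the warning useful as a signal in other projects where a resize would be unexpected.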

Please open a separate issue with more details (hardware, batch size, input size, etc.) for the sizing issue. But I agree with the suggestion made by @ancesstor2. This model does demand a great deal of memory.
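To illustrate why a smaller batch size helps, here is a rough back-of-the-envelope sketch. It assumes a patch size of 32, a 224-pixel input (which would give the 7 by 7 grid from the warning), 12 heads, 12 layers, and float32 values; none of these were confirmed in this thread. The helper names are made up for illustration. It counts only the attention score matrices, which are one small part of the total activation memory, so the absolute numbers are not a real estimate. The point is that activation memory scales linearly with the batch size.

```python
def num_tokens(image_size: int, patch_size: int) -> int:
    """Number of patches (grid side squared) plus one class token."""
    grid = image_size // patch_size
    return grid * grid + 1


def attention_scores_bytes(
    batch_size: int,
    tokens: int,
    heads: int,
    layers: int,
    bytes_per_value: int = 4,  # float32
) -> int:
    """Bytes needed to hold every layer's attention score matrices."""
    return batch_size * heads * tokens * tokens * layers * bytes_per_value


# Assumed configuration (not confirmed in the thread).
tokens = num_tokens(image_size=224, patch_size=32)

for batch_size in (64, 32, 16):
    size = attention_scores_bytes(batch_size, tokens, heads=12, layers=12)
    print(f"batch_size={batch_size}: {size / 2**20:.1f} MiB of attention scores")
```

Halving the batch size halves this figure, and the same holds for the other per-sample activations kept around for backpropagation.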