Closed dhea1323 closed 2 years ago
Already solved
How did you solve this problem?
Already solved
Bro, how did you solve this??
You can check feature_extractor.size to see the size that will be used when resizing images. Note that a multimodal model like TrOCR consists of a feature extractor for preparing the images and a tokenizer for preparing the text targets; a processor combines both.
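To see why the resize size matters, a short sketch of the arithmetic: a patch16 ViT/BEiT encoder splits an H×W image into (H/16)·(W/16) patches, and its position embeddings are created for exactly that patch count at pretraining time, so feeding 384×384 images to a model pretrained on 224×224 produces a mismatch (the helper function below is illustrative, not from the notebook).

```python
# Why image size and model size must agree for a patch16 encoder:
# the number of patch embeddings is fixed by the pretraining resolution.
def num_patches(image_size: int, patch_size: int = 16) -> int:
    """Number of patch embeddings a patch16 encoder produces for a square image."""
    return (image_size // patch_size) ** 2

print(num_patches(224))  # 196 patches -- what beit-base-patch16-224 was pretrained with
print(num_patches(384))  # 576 patches -- what a 384x384 feature extractor produces
```

Because 576 ≠ 196, the encoder's position embeddings cannot be applied, which is exactly the ValueError reported below in this thread.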
Hello @NielsRogge and community,
I tried to combine
microsoft/beit-base-patch16-224
and cahya/roberta-base-indonesian-1.5G
just by changing the following code in Fine_tune_TrOCR_on_IAM_Handwriting_Database_using_native_PyTorch.ipynb, using my own dataset. The following error occurs:
ValueError: Input image size (384*384) doesn't match model (224*224)
Is there anything else that must be adjusted, or did I make a mistake somewhere? When I run the notebook with my own dataset without changing anything else, no error occurs.
Thank you.
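A hedged sketch of one way to resolve the mismatch above: configure the image processor to resize to the 224×224 resolution that microsoft/beit-base-patch16-224 was pretrained on, rather than the 384×384 used by the TrOCR checkpoints in the notebook. The class name assumes a recent transformers release (ViTImageProcessor; older versions call it ViTFeatureExtractor), and the dummy image is only there to show the resulting tensor shape.

```python
import numpy as np
from PIL import Image
from transformers import ViTImageProcessor

# Resize inputs to 224x224 to match beit-base-patch16-224's pretraining
# resolution (the TrOCR notebook's feature extractor resizes to 384x384).
image_processor = ViTImageProcessor(size={"height": 224, "width": 224})

# Dummy image standing in for a handwriting crop, just to inspect the output.
image = Image.fromarray(np.zeros((600, 400, 3), dtype=np.uint8))
pixel_values = image_processor(image, return_tensors="np").pixel_values
print(pixel_values.shape)  # (1, 3, 224, 224)
```

Alternatively, loading the feature extractor directly from the BEiT checkpoint with AutoFeatureExtractor.from_pretrained("microsoft/beit-base-patch16-224") should pick up the correct size automatically, and that feature extractor can then be combined with the Indonesian tokenizer in a processor.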