NoushNabi closed this issue 7 months ago
The error message you're encountering indicates that convert.py is attempting to convert the microsoft/trocr-base-printed model into an AutoModelForCausalLM, which is intended for language modeling. However, TrOCR is a model designed for text recognition (OCR) and thus should not be converted to a causal language model.
Instead, you should use the class designed for this model type. The transformers library provides the VisionEncoderDecoderModel class for vision-encoder-decoder checkpoints such as TrOCR.
Here's an example of how you might do this:

```python
from transformers import VisionEncoderDecoderModel, VisionEncoderDecoderConfig

config = VisionEncoderDecoderConfig.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed", config=config)
```
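The class mismatch behind the reported error can be reproduced without downloading the checkpoint. The sketch below builds a tiny, randomly initialized vision-encoder-decoder config (the sizes are arbitrary, not those of the real trocr-base-printed model) and shows that VisionEncoderDecoderModel accepts it while AutoModelForCausalLM raises the same ValueError as in the issue:

```python
# Minimal sketch: tiny, made-up configs (NOT the real trocr-base-printed sizes)
# illustrating why AutoModelForCausalLM rejects a VisionEncoderDecoderConfig.
from transformers import (
    AutoModelForCausalLM,
    TrOCRConfig,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

encoder_cfg = ViTConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                        intermediate_size=64, image_size=32, patch_size=16)
decoder_cfg = TrOCRConfig(d_model=32, decoder_layers=1, decoder_attention_heads=2,
                          decoder_ffn_dim=64, vocab_size=100)
cfg = VisionEncoderDecoderConfig.from_encoder_decoder_configs(encoder_cfg, decoder_cfg)

# The dedicated class accepts the config.
model = VisionEncoderDecoderModel(config=cfg)
print(type(model).__name__)

# The auto class for causal LMs does not: this reproduces the reported error.
try:
    AutoModelForCausalLM.from_config(cfg)
except ValueError as err:
    print("Raised ValueError, as in the issue")
```

Note that TrOCRConfig does appear in the AutoModelForCausalLM list in the error message, but only because the *decoder half* of TrOCR is a causal LM; the top-level config of the checkpoint is VisionEncoderDecoderConfig, which is not.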
Make sure you're using the latest version of the transformers library, as the API may change with each update. If convert.py is a script provided by a third party, you may need to check its documentation or source code to see if it supports the TrOCR model or if modifications are required to support this type of model.
If you're trying to use convert.py to convert the model format (e.g., from PyTorch to TensorFlow), you may need to find a conversion tool that specifically supports TrOCR or perform the conversion manually. In some cases, conversion may not be directly supported, especially for non-standard model architectures. In such instances, you may need to contact the model maintainers or search for community-provided solutions.
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
@NoushNabi llm_bench is not targeted to support trOCR, please use optimum-intel interface and optimum-cli tool for model export and inference
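For reference, a minimal sketch of the suggested optimum-cli workflow. The output directory name is arbitrary, and the exact extras/flags may differ by optimum version, so check the optimum-intel documentation:

```shell
# Sketch, assuming optimum with the OpenVINO extra is installed.
pip install "optimum[openvino]"

# Export the checkpoint to OpenVINO IR format.
optimum-cli export openvino --model microsoft/trocr-base-printed trocr_ov/
```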
Context
Running convert.py on microsoft/trocr-base-printed gives the error below:
ValueError: Unrecognized configuration class <class 'transformers.models.vision_encoder_decoder.configuration_vision_encoder_decoder.VisionEncoderDecoderConfig'> for this kind of AutoModel: AutoModelForCausalLM. Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, FalconConfig, FuyuConfig, GemmaConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MambaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
As you can see in the error message, it says Model type should be one of ... and "TrOCRConfig" is part of the list. So "trocr-base-printed" should be supported, but the conversion fails.
What needs to be done?
N/A
Example Pull Requests
No response
Resources
Contact points
N/A
Ticket
No response