Add a ViT-based OCR model to Hugging Face

wdp-007 commented 2 years ago

Model description

We want to add the MGPSTR model (ECCV 2022) to Hugging Face. MGPSTR is a pure-vision, ViT (Vision Transformer)-based model for scene text recognition (STR) that achieves state-of-the-art recognition accuracy. It uses a Multi-Granularity Prediction (MGP) strategy to inject information from the language modality.

We followed the guide at https://github.com/huggingface/transformers/tree/main/templates/adding_a_new_model but ran into some problems. For example, pip could not find a suitable huggingface-hub version when we installed the environment:

ERROR: Could not find a version that satisfies the requirement huggingface-hub<1.0,>=0.8.1 (from transformers[dev]) (from versions: 0.0.1, 0.0.2, 0.0.3rc1, 0.0.3rc2, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.14, 0.0.15, 0.0.16, 0.0.17, 0.0.18, 0.0.19, 0.1.0, 0.1.1, 0.1.2, 0.2.0, 0.2.1, 0.4.0)
ERROR: No matching distribution found for huggingface-hub<1.0,>=0.8.1

Can I get some help or guidance?

Open source status

[X] The model implementation is available
[X] The model weights are available

Provide useful links for the implementation

The paper will be published soon.

NielsRogge commented 2 years ago

Sure, do you have an email address? We can set up a Slack channel for easier communication.

wdp-007 commented 2 years ago

Yes, my email address is wdp0072012@gmail.com. Do I need to register for Slack in advance?

NielsRogge commented 2 years ago

You should have received an invite by email :)
