microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

DIT Text Detection Inference #1154

Open arvisioncode opened 1 year ago

arvisioncode commented 1 year ago

I would like to run simple inference with the DiT text detection model you provide, given a single input image.

The README for this component only explains how to do fine-tuning or evaluation. To get inference working, I started from the evaluation example: I downloaded some of the models you provide and tried to run the evaluation like this:

$ python train_net.py --config-file configs/mask_rcnn_dit_base.yaml --eval-only --resume MODEL.WEIGHTS ../../.models/funsd_dit-b_mrcnn.zip OUTPUT_DIR output

The problem is that I don't know how to plug in the FUNSD dataset to make this work. Can you give me some advice? Is there a more straightforward way to run simple inference on an arbitrary image?

rm-asif-amin commented 8 months ago

I'm facing the same problem. Did you find a solution?
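
For anyone landing here, one possible way to run inference on a single image without registering FUNSD is to skip train_net.py and use detectron2's DefaultPredictor directly. The following is only a minimal sketch, not a confirmed recipe from the maintainers. It assumes that it is run from the same directory as train_net.py, that the ditod package there exposes add_vit_config, and that the downloaded checkpoint is in a format detectron2 can load. The helper names build_overrides, build_predictor and detect are hypothetical and exist only for this example.

```python
def build_overrides(weights_path, device="cpu"):
    """Return detectron2-style KEY VALUE overrides as a flat list,
    the same form that is passed after the flags on the command line."""
    return ["MODEL.WEIGHTS", weights_path, "MODEL.DEVICE", device]


def build_predictor(config_file, weights_path, device="cpu"):
    """Build a detectron2 DefaultPredictor from a DiT config and checkpoint."""
    # Imported lazily so build_overrides works without detectron2 installed.
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor
    # Assumption: ditod exposes add_vit_config here, as used by train_net.py.
    from ditod import add_vit_config

    cfg = get_cfg()
    add_vit_config(cfg)               # register the ViT/DiT config keys first
    cfg.merge_from_file(config_file)  # e.g. configs/mask_rcnn_dit_base.yaml
    cfg.merge_from_list(build_overrides(weights_path, device))
    return DefaultPredictor(cfg)


def detect(predictor, image_path):
    """Run the predictor on one image and return (boxes, scores) as lists."""
    import cv2

    image = cv2.imread(image_path)    # BGR, detectron2's default input format
    if image is None:
        raise FileNotFoundError(image_path)
    instances = predictor(image)["instances"].to("cpu")
    boxes = instances.pred_boxes.tensor.tolist()  # [x1, y1, x2, y2] per box
    scores = instances.scores.tolist()
    return boxes, scores
```

With those assumptions, usage would look like predictor = build_predictor("configs/mask_rcnn_dit_base.yaml", "path/to/checkpoint") followed by boxes, scores = detect(predictor, "my_image.png"). The checkpoint path is a placeholder for whichever model file was downloaded. Pass device="cuda" if a GPU is available.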