ragavsachdeva / magi

Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match texts to their speakers. Perform OCR.

Cannot load the files locally. #3

Closed St1pn closed 5 months ago

St1pn commented 5 months ago

Background: I have already downloaded the following model files from huggingface.co: 1. ragavsachdeva/magi, 2. conditional-detr-resnet-50, 3. vit-mae-base, 4. trocr-base-printed. They were all put under the same directory, called Magi (the usage .py file was also in this directory). I have also modified the "_name_or_path" entries under ocr_pretrained_processor_path, detection_model_config and crop_embedding_model_config in magi/config.json to "./conditional-detr-resnet-50", "./vit-mae-base" and "./trocr-base-printed". (I haven't modified the "_name_or_path" under our_model_config in magi/config.json, which was originally set to "/work/rs/logs/manga_ocr/nt8rn2ul/". What does this parameter represent?)

Problem: When I tried to run magi following the Usage section of the README, specifying the model path in from_pretrained() as "./magi", I got this warning: "Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration." The script then still tried to fetch the model files from huggingface.co instead of loading them locally, and raised huggingface_hub.utils._errors.LocalEntryNotFoundError.
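
Roughly what I ran (a minimal sketch, assuming the AutoModel / trust_remote_code pattern from the README usage; the local path is the one described above):

```python
from transformers import AutoModel

# point from_pretrained at the local "./magi" folder instead of the Hub repo id
model = AutoModel.from_pretrained("./magi", trust_remote_code=True)
# -> emits the image-processor warning quoted above, then raises
#    huggingface_hub.utils._errors.LocalEntryNotFoundError while trying to
#    fetch the sub-models from huggingface.co
```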

Could you give a demo of how to load the files locally? (model files, location, parameter settings)

ragavsachdeva commented 5 months ago

Hi,

I am assuming you're able to perform inference using the provided example usage script, except that you now want the source code to be local on your machine.

I am afraid I can't spoon-feed you how to achieve what you're trying to do, but here is what I recommend. Start simple. Rather than downloading ragavsachdeva/magi + conditional-detr-resnet-50 + vit-mae-base + trocr-base-printed into a single folder, do this instead:
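
Something along these lines (a minimal sketch, assuming huggingface_hub's snapshot_download and the AutoModel / trust_remote_code loading path; folder names are illustrative):

```python
from huggingface_hub import snapshot_download
from transformers import AutoModel

# 1. Snapshot only ragavsachdeva/magi into a local folder
#    (weights, config and the remote modelling code).
local_dir = snapshot_download(repo_id="ragavsachdeva/magi", local_dir="./magi_local")

# 2. Load from that folder. The sub-models (detection, crop embedding, OCR)
#    are still resolved through the transformers library, not from your own copies.
model = AutoModel.from_pretrained(local_dir, trust_remote_code=True).eval()
```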

Note that even though this should run from the local source files, those files still have dependencies on, and imports from, the transformers library. So you'll see, for example, this warning:

Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.

which comes from this from_pretrained call in the processor. Similarly, this line will instantiate the ViTMAE model by making a call to the transformers library. If you want to have the source files for all these models locally, you'll need to incrementally make local copies of their modelling files from HF transformers and correctly update all the imports.
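
For illustration only, here is a hypothetical paraphrase of the kind of calls the remote modelling code makes (not the repo's exact lines; the repo ids are the upstream models named above):

```python
from transformers import TrOCRProcessor, ViTMAEModel, ViTMAEConfig

# The OCR processor is obtained via from_pretrained, which emits the
# image-processor warning quoted above and may reach out to the Hub:
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")

# The crop-embedding model is built through the transformers ViTMAE classes,
# so its modelling code always comes from the installed transformers package:
crop_embedder = ViTMAEModel(ViTMAEConfig())
```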

Regarding "/work/rs/logs/manga_ocr/nt8rn2ul/": it does nothing anymore (I've now updated the config to replace this path with a dummy placeholder).