blueaxis / Poricom

Optical character recognition in manga images. Manga OCR desktop application
GNU General Public License v3.0

Load model without internet connection #39

Closed: sg2013test closed this issue 1 year ago

sg2013test commented 1 year ago

I am using Poricom, but I am constantly having issues with the internet connection. With the MangaOCR model, I get the error "Please try again or make sure your internet connection is on".

Poricom / MangaOCR saves an offline copy of the pretrained model in %UserProfile%\.cache\huggingface\transformers, and I checked that the model is indeed there. This appears to be an issue with HuggingFace Transformers: even when the model is already stored in the local cache, it still needs an internet connection to load the model from it.

I also found that it should be possible to set TRANSFORMERS_OFFLINE=1 and HF_DATASETS_OFFLINE=1 to run image recognition in a firewalled or offline environment using only local files.
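A minimal sketch of how those two variables could be set from Python, for anyone who wants to try this. The helper name `enable_offline_mode` is hypothetical and not part of Poricom, and where to call it in Poricom's startup code is an assumption. The variables most likely need to be set before `transformers` (or `manga_ocr`) is imported, since they appear to be read at import time.

```python
import os

# The two variables mentioned above; "1" switches the offline behaviour on.
OFFLINE_FLAGS = {
    "TRANSFORMERS_OFFLINE": "1",
    "HF_DATASETS_OFFLINE": "1",
}


def enable_offline_mode(env=None):
    """Set the offline flags in `env` (defaults to this process's environment).

    Hypothetical helper for illustration; it is not part of Poricom.
    """
    if env is None:
        env = os.environ
    for name, value in OFFLINE_FLAGS.items():
        env[name] = value
    # Return the values actually stored, which is handy for logging.
    return {name: env[name] for name in OFFLINE_FLAGS}


# Call this first, then import transformers / manga_ocr afterwards.
flags = enable_offline_mode()
print(flags)
```

The same effect can be had without touching the code by setting both variables in the shell or in the Windows environment settings before starting the application.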

Can someone please help me implement this in Poricom, or tell me which files I should change to add those values and implement an offline option?

blueaxis commented 1 year ago

> constantly having issues with internet connection

Do you still get the connection error when your internet connection is on? Did you download the latest release, or did you build the app on your own?

> Can someone please help me implement this in Poricom, or tell me which files I should change to add those values and implement an offline option?

I don't know if offline loading is possible. IIRC, huggingface needs an internet connection the first time in order to download the model, and it still needs one on later loads. I am not sure why, but it might be because huggingface needs to verify the hash of the downloaded models (?)

wang-0201 commented 1 year ago

You can modify line 287 of code/MainWindow.py:

tracker.ocrModel = MangaOcr(pretrained_model_name_or_path=self.config['MANGA_MODULE_CACHE_PATH'])

Then, in code/utils/config.toml, set

CHECK_INTERNET_POPUP = false

and add the line

MANGA_MODULE_CACHE_PATH = "{path to the manga-ocr model downloaded on your PC}"

This is needed because you have to set the model path manually to initialize manga_ocr.
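The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: `resolve_local_model_dir` and `load_manga_ocr_offline` are hypothetical helper names that do not exist in Poricom, `config` stands in for the dictionary read from config.toml, and the check for a `config.json` file assumes the folder is a normal HuggingFace model directory. The only call taken from the snippet above is `MangaOcr(pretrained_model_name_or_path=...)`.

```python
from pathlib import Path


def resolve_local_model_dir(config: dict) -> Path:
    """Return the local model folder named by MANGA_MODULE_CACHE_PATH.

    Hypothetical helper; `config` is assumed to be the parsed config.toml.
    """
    raw_path = config.get("MANGA_MODULE_CACHE_PATH")
    if not raw_path:
        raise KeyError("MANGA_MODULE_CACHE_PATH is not set in the config")

    model_dir = Path(raw_path).expanduser()
    # Assumption: a downloaded HuggingFace model folder contains config.json.
    if not (model_dir / "config.json").is_file():
        raise FileNotFoundError(f"No config.json found in {model_dir}")
    return model_dir


def load_manga_ocr_offline(config: dict):
    """Create MangaOcr from a local folder instead of a hub model name."""
    # Imported here so the path check above works without manga_ocr installed.
    from manga_ocr import MangaOcr

    model_dir = resolve_local_model_dir(config)
    return MangaOcr(pretrained_model_name_or_path=str(model_dir))
```

Checking the folder first gives a clearer error than a failed download attempt when the path in config.toml is wrong.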

blueaxis commented 1 year ago

@wang-0201 thanks for the info

Unfortunately this will not work with the current main branch. Use the 0.4.1 tag instead.