Open linnnff opened 5 months ago
Same question as above
Download models: if you want to download the models before starting the server:

```
python download.py --documents --media --web
```

- `--documents`: load the models that help parse and ingest documents (the Surya OCR series of models and Florence-2).
- `--media`: load the Whisper model to transcribe audio and video files.
- `--web`: set up the Selenium crawler.

These models are downloaded through Hugging Face, which has a caching mechanism. To find the downloaded models, refer to the cache setup notes:
Cache setup: pretrained models are downloaded and locally cached at `~/.cache/huggingface/hub`. This is the default directory given by the shell environment variable `TRANSFORMERS_CACHE`. On Windows, the default directory is `C:\Users\username\.cache\huggingface\hub`. You can change the shell environment variables shown below, in order of priority, to specify a different cache directory.
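The lookup order described above can be sketched in Python. This is a simplified approximation for orientation only; the real resolution in `huggingface_hub` handles additional variables and legacy fallbacks such as `TRANSFORMERS_CACHE`:

```python
import os
from pathlib import Path

def hf_hub_cache_dir() -> Path:
    """Approximate the Hub cache directory resolution, in order of priority."""
    # 1. HF_HUB_CACHE points directly at the hub cache.
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    # 2. HF_HOME sets the base directory; the hub cache lives under it.
    if "HF_HOME" in os.environ:
        return Path(os.environ["HF_HOME"]) / "hub"
    # 3. Default: ~/.cache/huggingface/hub (per the docs quoted above).
    return Path.home() / ".cache" / "huggingface" / "hub"
```

Setting one of these variables before running `download.py` redirects where the models are stored.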
A configuration file (or a Python settings.py) is needed, with support for modifying the model links in that configuration so that local models can be referenced.
Because my machine has no network access, I would like to download the models to a local directory. Please let me know which models need to be downloaded and which directory to place them in after downloading.
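For downloading on one machine and placing the files on an offline one, the files must land in the layout the Hub cache expects: each repo lives in a folder named `models--<org>--<name>`, with the actual files under `snapshots/<revision>/` inside it. A minimal sketch of that naming convention follows; the repo id in the comment is illustrative only, since the exact models `download.py` pulls are not listed in this thread:

```python
from pathlib import Path

def cache_folder_for(repo_id: str,
                     cache_dir: str = "~/.cache/huggingface/hub") -> Path:
    """Return the Hub cache folder for a repo id like 'org/name'.

    Files copied over by hand go under snapshots/<revision>/ inside
    this folder, mirroring what huggingface_hub would create.
    """
    folder = "models--" + repo_id.replace("/", "--")
    return Path(cache_dir).expanduser() / folder

# e.g. an illustrative repo id (not necessarily one download.py uses):
# cache_folder_for("openai/whisper-base")
#   -> ~/.cache/huggingface/hub/models--openai--whisper-base
```

Alternatively, on a machine that does have network access, `huggingface_hub.snapshot_download(repo_id, local_dir=...)` can fetch a repo into a plain directory that you then copy across.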