haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

How to load LLaVA on a server with no Internet connection? #348

Closed gnimyang closed 1 year ago

gnimyang commented 1 year ago

When did you clone our code?

I cloned the code base after 5/1/23

Describe the issue

I manually downloaded the pre-trained model to a local path by clicking the download button for each file (see screenshot). I then set the worker's model path to that directory (see screenshot) and launched the worker with: python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path C:\Users\tx\Documents\LLaVA-main\llava-v1-0719-336px-lora-vicuna-13b-v1.3 (see screenshot). How can I solve this problem? I am not sure whether the pre-trained model is being used correctly.

gnimyang commented 1 year ago

I have already solved that problem. However, I want to know how to work with the pre-trained models: how many pre-trained models do I need to download? I already downloaded the model from https://huggingface.co/liuhaotian/llava-v1-0719-336px-lora-merge-vicuna-13b-v1.3/tree/main, but the code still tells me I need an internet connection. Due to machine restrictions, I have to put the model on a remote computer (an HPC) that cannot access the internet. Could you explain which pre-trained models I need to download and how to use them?

When I run the command python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000/ --port 40000 --worker http://localhost:40000/ --model-path ./checkpoints/LLaVA-13B-v0, I get the following output:

2023-08-04 19:42:46 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000/', controller_address='http://localhost:10000/', model_path='./checkpoints', model_base=None, model_name=None, multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False)
2023-08-04 19:42:46 | INFO | model_worker | Loading the model checkpoints on worker 052325 ...
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
2023-08-04 19:42:48 | ERROR | stderr | Traceback (most recent call last):
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
2023-08-04 19:42:48 | ERROR | stderr |     resolved_file = hf_hub_download(
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
2023-08-04 19:42:48 | ERROR | stderr |     return fn(*args, **kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1291, in hf_hub_download
2023-08-04 19:42:48 | ERROR | stderr |     raise LocalEntryNotFoundError(
2023-08-04 19:42:48 | ERROR | stderr | huggingface_hub.utils._errors.LocalEntryNotFoundError: Connection error, and we cannot find the requested files in the disk cache. Please try again or make sure your Internet connection is on.
2023-08-04 19:42:48 | ERROR | stderr |
2023-08-04 19:42:48 | ERROR | stderr | During handling of the above exception, another exception occurred:
2023-08-04 19:42:48 | ERROR | stderr |
2023-08-04 19:42:48 | ERROR | stderr | Traceback (most recent call last):
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2023-08-04 19:42:48 | ERROR | stderr |     return _run_code(code, main_globals, None,
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/runpy.py", line 86, in _run_code
2023-08-04 19:42:48 | ERROR | stderr |     exec(code, run_globals)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/serve/model_worker.py", line 273, in <module>
2023-08-04 19:42:48 | ERROR | stderr |     worker = ModelWorker(args.controller_address,
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/serve/model_worker.py", line 64, in __init__
2023-08-04 19:42:48 | ERROR | stderr |     self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/builder.py", line 121, in load_pretrained_model
2023-08-04 19:42:48 | ERROR | stderr |     model = AutoModelForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
2023-08-04 19:42:48 | ERROR | stderr |     return model_class.from_pretrained(
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
2023-08-04 19:42:48 | ERROR | stderr |     model = cls(config, *model_args, **model_kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/language_model/llava_llama.py", line 46, in __init__
2023-08-04 19:42:48 | ERROR | stderr |     self.model = LlavaLlamaModel(config)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/language_model/llava_llama.py", line 38, in __init__
2023-08-04 19:42:48 | ERROR | stderr |     super(LlavaLlamaModel, self).__init__(config)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/llava_arch.py", line 32, in __init__
2023-08-04 19:42:48 | ERROR | stderr |     self.vision_tower = build_vision_tower(config, delay_load=True)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/multimodal_encoder/builder.py", line 7, in build_vision_tower
2023-08-04 19:42:48 | ERROR | stderr |     return CLIPVisionTower(vision_tower, args=vision_tower_cfg, **kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/multimodal_encoder/clip_encoder.py", line 20, in __init__
2023-08-04 19:42:48 | ERROR | stderr |     self.cfg_only = CLIPVisionConfig.from_pretrained(self.vision_tower_name)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/clip/configuration_clip.py", line 239, in from_pretrained
2023-08-04 19:42:48 | ERROR | stderr |     config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/configuration_utils.py", line 617, in get_config_dict
2023-08-04 19:42:48 | ERROR | stderr |     config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
2023-08-04 19:42:48 | ERROR | stderr |     resolved_config_file = cached_file(
2023-08-04 19:42:48 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/utils/hub.py", line 452, in cached_file
2023-08-04 19:42:48 | ERROR | stderr |     raise EnvironmentError(
2023-08-04 19:42:48 | ERROR | stderr | OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like openai/clip-vit-large-patch14-336 is not the path to a directory containing a file named config.json.
2023-08-04 19:42:48 | ERROR | stderr | Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

Which pre-trained models do I need to download?
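As a side note on the offline-mode link at the end of that error message: transformers and huggingface_hub can be told to skip all network lookups and use only files already on disk. A minimal sketch (the environment variable names come from the Transformers offline-mode guide linked above; how you set them in your launch script is up to you):

```python
import os

# Force transformers / huggingface_hub to use only files already on disk and never
# attempt a network lookup. These must be set before transformers is imported
# (or exported in the shell before launching the model worker).
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_HUB_OFFLINE"] = "1"
```

Offline mode does not remove the need to have every required file on disk; in this case the missing piece is the openai/clip-vit-large-patch14-336 vision encoder, which is addressed below.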

gnimyang commented 1 year ago

How can I manually add openai/clip-vit-large-patch14-336?

haotian-liu commented 1 year ago

Hi, for offline machines, you can download the corresponding CLIP weights from Hugging Face. For CLIP 336px, it is https://huggingface.co/openai/clip-vit-large-patch14-336.

Set the mm_vision_tower option in config.json to the local directory where the CLIP encoder is stored.

Also, please use the merged weights, unless you have the Vicuna weights locally and pass the Vicuna path via --model-base.
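To make those steps concrete, here is a hedged sketch: download the CLIP weights on a machine with internet access, copy them to the offline machine, and point mm_vision_tower at the local folder. It assumes a recent huggingface_hub that supports local_dir; the target paths and the checkpoint folder name are examples, not required names:

```python
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# 1) On a machine with internet access: download the CLIP-336px encoder into a plain folder
#    (example target path; any local directory works).
snapshot_download(
    repo_id="openai/clip-vit-large-patch14-336",
    local_dir="checkpoints/clip-vit-large-patch14-336",
)

# 2) Copy that folder and the merged LLaVA checkpoint to the offline machine, reproduce the
#    same relative layout there, then point mm_vision_tower in the LLaVA checkpoint's
#    config.json at the local CLIP directory. "checkpoints/LLaVA-vicuna-13B-v1.3" is an
#    example checkpoint folder name.
config_path = Path("checkpoints/LLaVA-vicuna-13B-v1.3/config.json")
config = json.loads(config_path.read_text())
config["mm_vision_tower"] = str(Path("checkpoints/clip-vit-large-patch14-336").resolve())
config_path.write_text(json.dumps(config, indent=2))
```

If pointing config.json at an absolute local path still fails (as it does later in this thread), see the relative-path layout suggested further down.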

gnimyang commented 1 year ago

I put the downloaded files here (see screenshots) and set the mm_vision_tower option in config.json to the local directory where the CLIP encoder is stored: /public/home/v-yumy/Pycharm_Project/LLaVA/checkpoints/openai-clip-vit-large-patch14-336/ (see screenshot).

But it still does not work:

2023-08-05 19:38:14 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000', controller_address='http://localhost:10000', model_path='./checkpoints/LLaVA-vicuna-13B-v1.3', model_base=None, model_name=None, multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False)
2023-08-05 19:38:14 | INFO | model_worker | Loading the model LLaVA-vicuna-13B-v1.3 on worker e019a2 ...
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
2023-08-05 19:38:14 | ERROR | stderr | Traceback (most recent call last):
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2023-08-05 19:38:14 | ERROR | stderr |     return _run_code(code, main_globals, None,
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/runpy.py", line 86, in _run_code
2023-08-05 19:38:14 | ERROR | stderr |     exec(code, run_globals)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/serve/model_worker.py", line 273, in <module>
2023-08-05 19:38:14 | ERROR | stderr |     worker = ModelWorker(args.controller_address,
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/serve/model_worker.py", line 64, in __init__
2023-08-05 19:38:14 | ERROR | stderr |     self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/builder.py", line 100, in load_pretrained_model
2023-08-05 19:38:14 | ERROR | stderr |     model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
2023-08-05 19:38:14 | ERROR | stderr |     model = cls(config, *model_args, **model_kwargs)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/language_model/llava_llama.py", line 46, in __init__
2023-08-05 19:38:14 | ERROR | stderr |     self.model = LlavaLlamaModel(config)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/language_model/llava_llama.py", line 38, in __init__
2023-08-05 19:38:14 | ERROR | stderr |     super(LlavaLlamaModel, self).__init__(config)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/llava_arch.py", line 32, in __init__
2023-08-05 19:38:14 | ERROR | stderr |     self.vision_tower = build_vision_tower(config, delay_load=True)
2023-08-05 19:38:14 | ERROR | stderr |   File "/public/home/v-yumy/Pycharm_Project/LLaVA/llava/model/multimodal_encoder/builder.py", line 9, in build_vision_tower
2023-08-05 19:38:14 | ERROR | stderr |     raise ValueError(f'Unknown vision tower: {vision_tower}')
2023-08-05 19:38:14 | ERROR | stderr | ValueError: Unknown vision tower: /public/home/v-yumy/Pycharm_Project/LLaVA/checkpoints/openai-clip-vit-large-patch14-336

Why does the vision tower builder not identify this path as a model?
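For context on why this can happen: the traceback shows the error raised in llava/model/multimodal_encoder/builder.py, which rejects vision-tower strings it does not recognize. The snippet below is a hypothetical, simplified reconstruction of that kind of name-based check (not the repository's exact code); under such a check an arbitrary absolute path can be rejected, while a string beginning with "openai" passes and is then resolved by transformers either on the Hub or as a local folder relative to the working directory:

```python
def is_recognized_vision_tower(vision_tower: str) -> bool:
    """Hypothetical sketch of a name-based validation that would explain the error above.

    The real build_vision_tower in builder.py may differ; this only illustrates why
    "/public/.../openai-clip-vit-large-patch14-336" can raise "Unknown vision tower"
    while "openai/clip-vit-large-patch14-336" does not.
    """
    return vision_tower.startswith("openai")
```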

gnimyang commented 1 year ago

Is something wrong? The local directory of openai-clip-vit-large-patch14-336 cannot be loaded as mm_vision_tower.

gnimyang commented 1 year ago

Additionally, I am using the merged model (see screenshots), but it still does not work.

haotian-liu commented 1 year ago

One alternative I can think of is to keep config.json the same as the original, and put the vision encoder folder at openai/clip-vit-large-patch14-336 inside the LLaVA project root. Basically, we want the model to be able to find this folder by navigating directly from the LLaVA project root (and you should execute the command in the LLaVA folder as well); see the sketch below.
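A minimal sketch of that layout and a quick sanity check, run from the LLaVA project root. The openai/clip-vit-large-patch14-336 folder name must match the name kept in config.json; the checkpoint folder name is just an example:

```python
from pathlib import Path

# Expected layout relative to the LLaVA project root (the directory you launch the worker from):
#
#   LLaVA/
#   ├── openai/
#   │   └── clip-vit-large-patch14-336/   <- downloaded CLIP weights (config.json, weights, ...)
#   └── checkpoints/
#       └── LLaVA-vicuna-13B-v1.3/        <- merged LLaVA checkpoint (example name)
#
# Because config.json keeps "openai/clip-vit-large-patch14-336", transformers resolves it as a
# local folder relative to the current working directory before ever touching the network.

clip_config = Path("openai/clip-vit-large-patch14-336/config.json")
assert clip_config.is_file(), (
    "CLIP weights not found; launch the worker from the LLaVA project root so that "
    "'openai/clip-vit-large-patch14-336' resolves to the local folder."
)
print("Found local CLIP encoder at", clip_config.parent.resolve())
```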

gnimyang commented 1 year ago

Thanks, it works when I put the model in the LLaVA project folder (see screenshot).

gnimyang commented 1 year ago

Thanks! The solution is so easy!

haotian-liu commented 1 year ago

Glad to hear that the problem is solved.