Offloader issue - Githubissues

sadatamit17 commented 8 months ago

There is an issue when running the command python main.py --config config/CHAMELEON_LLaVA1.5.yaml: it asks for an offload folder. Even after we added the offload folder and ran the command again, the process gets killed once loading of the checkpoint shards reaches 33%. Could you please help me resolve this problem?

lwpyh commented 8 months ago

Hi there,

Thank you for your interest in our paper. Unfortunately, we have never encountered a similar issue. Could you provide us with the specific error you are encountering during execution?

Best

sadatamit17 commented 8 months ago

When I run the code, the ValueError below is raised.

"python main.py --config config/CHAMELEON_LLaVA1.5.yaml"

2024-01-18 10:46:01 dataset size: 76
llava pretrained model: liuhaotian/llava-v1.5-13b
Traceback (most recent call last):
  File "/home/amit/Downloads/G2/GenSAM-main/main.py", line 89, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_args.model_base, model_args.model_name)
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amit/Downloads/G2/GenSAM-main/llava/model/builder.py", line 103, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amit/Downloads/G2/GenSAM-main/GenSAM_LLaVA/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amit/Downloads/G2/GenSAM-main/GenSAM_LLaVA/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3002, in _load_pretrained_model
    raise ValueError(
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.

After adding the offload folder in builder.py, it shows:

Loading checkpoint shards: 33%|██████████████████████████▋ | 1/3 [00:05<00:10, 5.23s/it]Killed
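For readers hitting the same ValueError, here is a minimal sketch of how an offload folder could be passed through the kwargs that builder.py forwards to from_pretrained. The helper name build_offload_kwargs and the folder path are hypothetical, not part of GenSAM or LLaVA; device_map and offload_folder are existing from_pretrained arguments in transformers. It is an assumption that this matches what was changed in builder.py above.

```python
import os


def build_offload_kwargs(offload_dir="offload", device_map="auto"):
    """Return keyword arguments that enable disk offloading in from_pretrained.

    Hypothetical helper for illustration only.
    """
    # Create the folder up front so offloaded weights have somewhere to go.
    os.makedirs(offload_dir, exist_ok=True)
    return {
        "device_map": device_map,       # let accelerate spread layers over GPU/CPU/disk
        "offload_folder": offload_dir,  # directory for weights offloaded to disk
    }


# Assumed use inside llava/model/builder.py (not executed here):
# kwargs.update(build_offload_kwargs("/path/to/offload"))
# model = LlavaLlamaForCausalLM.from_pretrained(
#     model_path, low_cpu_mem_usage=True, **kwargs)
```

Note that this only silences the ValueError. The checkpoint shards still have to pass through RAM while loading, so the process can still be killed if memory runs out.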

I ran the errors through Google Bard and ChatGPT, and they say the available RAM on my computer isn't enough. To accommodate this, I tried to make the system offload certain weights to disk, in a designated location.

But I am not sure why this is occurring. The first time I ran this, I saw the shards being downloaded, or maybe loaded.
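A bare "Killed" on Linux usually means the kernel's out-of-memory killer stopped the process, which is consistent with the RAM explanation. The rough check below compares an estimate of the weight size with the memory Linux reports as available. It is a sketch under assumptions: 13 billion parameters (taken from the name llava-v1.5-13b), 2 bytes per parameter (fp16), and no allowance for the vision encoder, activations, or SAM. Both function names are hypothetical.

```python
def estimate_weight_gib(num_params, bytes_per_param=2):
    """Approximate size of the model weights in GiB."""
    return num_params * bytes_per_param / 2**30


def mem_available_gib(meminfo_text):
    """Extract MemAvailable (reported in kB) from /proc/meminfo text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])
            return kib / 2**20
    return None  # field not present


if __name__ == "__main__":
    needed = estimate_weight_gib(13e9)  # assumed 13B parameters in fp16
    print(f"Estimated weight size: {needed:.1f} GiB")
    try:
        with open("/proc/meminfo") as f:  # Linux only
            available = mem_available_gib(f.read())
    except OSError:
        available = None
    if available is None:
        print("Could not read MemAvailable from /proc/meminfo")
    else:
        print(f"Available RAM: {available:.1f} GiB")
```

If the available figure is well below the estimate, loading in a lower precision or on a machine with more RAM or GPU memory is likely needed, regardless of the offload folder.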

lwpyh commented 8 months ago

Hi there,

We haven't encountered this issue. It seems to be a problem with the size of the cache. We hope this information is helpful to you.

jyLin8100 / GenSAM

Offloader issue #3