codeolder opened this issue 4 months ago
What GPU do you have? We have an auto installer, and a version that works with as little as 8 GB of VRAM (FP8 + tiled VAE + CPU offloading).
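For context on the tiled-VAE trick mentioned above: instead of decoding the whole latent at once, the image is decoded in small crops so that only one tile's activations occupy GPU memory at a time. A minimal sketch of the idea, assuming a NumPy-style latent, a hypothetical `decode_fn`, and an SD-style 8x spatial upscale (this is an illustration, not SUPIR's actual implementation, which also blends the overlapping seams):

```python
import numpy as np

def decode_tiled(latent, decode_fn, tile=64, overlap=8):
    """Decode a latent in overlapping tiles so only one tile is resident at a time.

    latent:    (H, W, C) latent array
    decode_fn: function mapping a latent crop to an (8h, 8w, 3) image crop
    """
    h, w = latent.shape[:2]
    scale = 8  # SD-style VAEs upscale latents 8x spatially
    out = np.zeros((h * scale, w * scale, 3), dtype=np.float32)
    step = tile - overlap  # advance less than a full tile so edges overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            ty, tx = min(y + tile, h), min(x + tile, w)
            # Only this crop's activations need to fit in GPU memory.
            decoded = decode_fn(latent[y:ty, x:tx])
            out[y * scale:ty * scale, x * scale:tx * scale] = decoded
    return out
```

Peak memory now scales with the tile size rather than the full image size, at the cost of redundant compute in the overlap regions.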
The GPU I use is an RTX 4060. Can you please provide that auto installer? I have tried installing this project many times and it has always failed.
Here is our video. It would work great on your GPU.
I tried registering an account and following the link mentioned in the video. Can you send me the file, or does this cost money?
@codeolder On a 40 GB A100, I made it work by setting load_4bit=True in test.py.
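To see why 4-bit loading helps here: the LLaVA captioner's weight footprint scales linearly with bits per weight. A back-of-the-envelope sketch, assuming a 13B-parameter LLaVA model and counting weights only (activations and the KV cache come on top of this):

```python
def llava_vram_gib(n_params: float, bits_per_weight: int) -> float:
    """Weight-only VRAM estimate in GiB: params * bits / 8 bytes, converted to GiB."""
    return n_params * bits_per_weight / 8 / 2**30

# Rough weight footprint of a 13B-parameter model at different precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit: {llava_vram_gib(13e9, bits):.1f} GiB")
# → 16-bit: 24.2 GiB
# →  8-bit: 12.1 GiB
# →  4-bit:  6.1 GiB
```

So at fp16 the LLaVA weights alone exceed a 24 GB card before SUPIR itself loads anything, while 4-bit quantization leaves most of the card free; the weights-only number is a lower bound on actual usage.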
Is there a baseline config that works with 24 GB (using test.py or a sanely modified version)? Without reading the sources or the paper, I don't have a sense of what adjustments to make to decrease memory usage.
Code.txt

BasicTransformerBlock is using checkpointing
Loaded model config from [options/SUPIR_v0.yaml]
Loaded state_dict from [/opt/data/private/AIGC_pretrain/SDXL_cache/sd_xl_base_1.0_0.9vae.safetensors]
Loaded state_dict from [/opt/data/private/AIGC_pretrain/SUPIR_cache/SUPIR-v0Q.ckpt]
Loading vision tower: openai/clip-vit-large-patch14-336
Loading checkpoint shards:  67%|██████████████████████████████████████████████████████████████████████████████             | 2/3 [00:23<00:11, 11.70s/it]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\PC\Downloads\SUPIR\test.py:72 in <module> │
│ │
│ 69 model = model.to(SUPIR_device) │
│ 70 # load LLaVA │
│ 71 if use_llava: │
│ ❱ 72 │ llava_agent = LLavaAgent(LLAVA_MODEL_PATH, device=LLaVA_device, load8bit=args.load │
│ 73 else: │
│ 74 │ llava_agent = None │
│ 75 │
│ │
│ C:\Users\PC\Downloads\SUPIR\llava\llava_agent.py:27 in __init__ │
│ │
│ 24 │ │ │ device_map = 'auto' │
│ 25 │ │ model_path = os.path.expanduser(model_path) │
│ 26 │ │ model_name = get_model_name_from_path(model_path) │
│ ❱ 27 │ │ tokenizer, model, image_processor, context_len = load_pretrained_model( │
│ 28 │ │ │ model_path, None, model_name, device=self.device, device_map=device_map, │
│ 29 │ │ │ load_8bit=load_8bit, load_4bit=load_4bit) │
│ 30 │ │ self.model = model │
│ │
│ C:\Users\PC\Downloads\SUPIR\llava\model\builder.py:103 in load_pretrained_model │
│ │
│ 100 │ │ │ │ model = LlavaMPTForCausalLM.from_pretrained(model_path, low_cpu_mem_usag │
│ 101 │ │ │ else: │
│ 102 │ │ │ │ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False) │
│ ❱ 103 │ │ │ │ model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_us │
│ 104 │ else: │
│ 105 │ │ # Load language model │
│ 106 │ │ if model_base is not None: │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:2795 in from_pretrained │
│ │
│ 2792 │ │ │ │ mismatched_keys, │
│ 2793 │ │ │ │ offload_index, │
│ 2794 │ │ │ │ error_msgs, │
│ ❱ 2795 │ │ │ ) = cls._load_pretrained_model( │
│ 2796 │ │ │ │ model, │
│ 2797 │ │ │ │ state_dict, │
│ 2798 │ │ │ │ loaded_state_dict_keys, # XXX: rename? │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:3123 in _load_pretrained_model │
│ │
│ 3120 │ │ │ │ ) │
│ 3121 │ │ │ │ │
│ 3122 │ │ │ │ if low_cpu_mem_usage: │
│ ❱ 3123 │ │ │ │ │ new_error_msgs, offload_index, state_dict_index = _load_state_dict_i │
│ 3124 │ │ │ │ │ │ model_to_load, │
│ 3125 │ │ │ │ │ │ state_dict, │
│ 3126 │ │ │ │ │ │ loaded_keys, │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\modeling_util │
│ s.py:698 in _load_state_dict_into_meta_model │
│ │
│ 695 │ │ │ state_dict_index = offload_weight(param, param_name, state_dict_folder, stat │
│ 696 │ │ elif not load_in_8bit: │
│ 697 │ │ │ # For backward compatibility with older versions of accelerate │
│ ❱ 698 │ │ │ set_module_tensor_to_device(model, param_name, param_device, **set_module_kw │
│ 699 │ │ else: │
│ 700 │ │ │ if param.dtype == torch.int8 and param_name.replace("weight", "SCB") in stat │
│ 701 │ │ │ │ fp16_statistics = state_dict[param_name.replace("weight", "SCB")] │
│ │
│ C:\Users\PC\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\modeling. │
│ py:149 in set_module_tensor_to_device │
│ │
│ 146 │ │ if value is None: │
│ 147 │ │ │ new_value = old_value.to(device) │
│ 148 │ │ elif isinstance(value, torch.Tensor): │
│ ❱ 149 │ │ │ new_value = value.to(device) │
│ 150 │ │ else: │
│ 151 │ │ │ new_value = torch.tensor(value, device=device) │
│ 152 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB. GPU