henk717 / KoboldAI

KoboldAI is generative AI software optimized for fictional use, but capable of much more!
http://koboldai.com
GNU Affero General Public License v3.0

Runpod problem with starting Pygmalion-7B #446

Open · skaba04 opened 1 year ago

skaba04 commented 1 year ago

I have been running the Pygmalion 13B model on Runpod fine, but whenever I try to load Pygmalion 7B or 6B, it fails. The model is located here: https://huggingface.co/TehVenom/Pygmalion-7b-4bit-GPTQ-Safetensors

This error message shows up:

```
File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 238, in _load
    self.model = self._get_model(self.get_local_model_path())

File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 392, in _get_model
    model = AutoGPTQForCausalLM.from_quantized(location, model_basename=Path(gptq_file).stem, use_safetensors=gptq_file.endswith(".safetensors"), device_map=device_map, inject_fused_attention=False)
        location   = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
        gptq_file  = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/Pygmalion-7B-GPTQ-4bit.act-order.safetensors'
        device_map = {'model.layers.0': 0, 'model.layers.1': 0, 'model.layers.2': 0, 'model.layers.3': 0, 'model.layers.4': 0, 'model.layers.5': 0...

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/auto.py", line 108, in from_quantized
    return quant_func(
        quant_func = <bound method BaseGPTQForCausalLM.from_quantized of <class 'auto_gptq.modeling.llama.LlamaGPTQForCausalLM'>>

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 757, in from_quantized
    quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **cached_file_kwargs, **kwargs)
        model_name_or_path = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
        cached_file_kwargs = {'cache_dir': None, 'force_download': False, 'proxies': None, 'resume_download': False, 'local_files_only': False, 'use_auth...
        kwargs             = {}

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 93, in from_pretrained
    with open(resolved_config_file, "r", encoding="utf-8") as f:
        resolved_config_file = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'

FileNotFoundError: [Errno 2] No such file or directory: 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'
```
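The traceback shows AutoGPTQ failing because there is no quantize_config.json next to the downloaded weights; old GPTQ-for-LLaMa conversions never shipped that file. One workaround sometimes tried is to write the file by hand. A minimal sketch, where every value is an assumption inferred from the "4bit.act-order" filename and must match how the model was actually quantized:

```python
import json
from pathlib import Path

# Hypothetical workaround: create the quantize_config.json that AutoGPTQ
# expects next to the downloaded weights. ALL values below are assumptions
# inferred from the filename "Pygmalion-7B-GPTQ-4bit.act-order.safetensors";
# they must match the settings actually used during quantization.
model_dir = Path("models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors")

quantize_config = {
    "bits": 4,                # "4bit" in the filename
    "group_size": -1,         # assumed: no group size appears in the filename
    "desc_act": True,         # assumed from "act-order" in the filename
    "sym": True,              # assumed AutoGPTQ default
    "true_sequential": True,  # assumed AutoGPTQ default
}

(model_dir / "quantize_config.json").write_text(
    json.dumps(quantize_config, indent=2), encoding="utf-8"
)
print("wrote", model_dir / "quantize_config.json")
```

Even with the file in place, a conversion this old may still fail on newer stacks, which is the compatibility problem described in the reply below.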

henk717 commented 1 year ago

This is a known issue with this model; it is a very old conversion. If occam's GPTQ module gets updated to work on newer Huggingface versions, we can continue to support it. Otherwise I suggest finding a newer conversion of the model that is compatible with AutoGPTQ, or loading the 16-bit version so KoboldAI quantizes it for you.
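To verify that a candidate conversion is AutoGPTQ-compatible before pointing KoboldAI at it, you can try loading it directly with auto-gptq. A minimal sketch, assuming auto-gptq is installed and a CUDA GPU is available; the repo ID is a placeholder, not a real upload:

```python
# Compatibility check run outside KoboldAI. The repo ID below is a
# placeholder; substitute whichever newer GPTQ conversion you want to test.
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    "some-user/newer-pygmalion-7b-gptq",  # placeholder, not a real repo
    use_safetensors=True,
    device="cuda:0",
)
print("loaded OK:", type(model).__name__)
```

If this load succeeds, the conversion ships the metadata that KoboldAI's AutoGPTQ fallback needs; if it raises the same FileNotFoundError for quantize_config.json, the conversion has the same problem as the one above.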

skaba04 commented 1 year ago

OK, so if nothing works and I can't seem to find an AutoGPTQ version, the only thing left for me is to wait.

skaba04 commented 1 year ago

But it says that it is quantized with GPTQ-for-LLaMa. Is that any different?

henk717 commented 1 year ago

Yes, because it lacks the config file that our AutoGPTQ fallback needs. I can't legally tell you where to get reuploads, but uploads compatible with the current KoboldAI United versions do exist, so you don't have to wait.
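Since the distinguishing feature is that missing config file, a reupload can be vetted without downloading any weights by checking whether the repo contains quantize_config.json. A small sketch using huggingface_hub; the repo ID is a placeholder:

```python
# Vet a candidate reupload without downloading the weights: the AutoGPTQ
# fallback needs quantize_config.json to be present in the repo.
from huggingface_hub import list_repo_files

repo_id = "some-user/newer-pygmalion-7b-gptq"  # placeholder, not a real repo

files = list_repo_files(repo_id)
if "quantize_config.json" in files:
    print(repo_id, "looks AutoGPTQ-compatible")
else:
    print(repo_id, "is likely an old GPTQ-for-LLaMa conversion")
```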

skaba04 commented 1 year ago

OK, so where are those uploads that are compatible with the current KoboldAI United versions?