jackielii closed this issue 8 months ago
We are experiencing the same issue with our Bloomz model http://hf.co/cmarkea/bloomz-7b1-mt-sft-chat this model is also chunked.
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/bloom_modeling.py", line 609, in __init__
self.word_embeddings = TensorParallelEmbedding(
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/layers.py", line 306, in __init__
weight = weights.get_partial_sharded(f"{prefix}.weight", dim=0)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/weights.py", line 76, in get_partial_sharded
filename, tensor_name = self.get_filename(tensor_name)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/weights.py", line 52, in get_filename
raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight word_embeddings.weight does not exist
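The failure in the traceback above can be sketched in isolation. This is a hypothetical, heavily simplified stand-in for TGI's `Weights` class (the real one in `weights.py` builds its routing table from the safetensors shard headers): tensor names are routed to shard files through a dict, and a lookup miss raises exactly the `RuntimeError` shown. A checkpoint whose tensors were saved under a `transformer.` prefix misses lookups for the unprefixed names the BLOOM modeling code asks for:

```python
class Weights:
    """Hypothetical sketch: maps tensor names to the shard file containing them."""

    def __init__(self, routing):
        # routing: {tensor_name: shard_filename}, built from checkpoint headers
        self.routing = routing

    def get_filename(self, tensor_name):
        filename = self.routing.get(tensor_name)
        if filename is None:
            raise RuntimeError(f"weight {tensor_name} does not exist")
        return filename, tensor_name


# The checkpoint stored its embedding under a "transformer." prefix...
weights = Weights({"transformer.word_embeddings.weight": "model-00001.safetensors"})

# ...but the model code asks for the unprefixed name, so the lookup fails:
try:
    weights.get_filename("word_embeddings.weight")
except RuntimeError as e:
    print(e)  # → weight word_embeddings.weight does not exist
```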
@jackielii Didn't you forget `--quantize gptq`? You seem to be loading a GPTQ model, given your issue.
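For reference, a minimal launch command with GPTQ quantization enabled. The model id below is a placeholder; substitute your own path or Hub id:

```shell
# Hypothetical invocation; --quantize gptq tells TGI to load GPTQ weights.
text-generation-launcher \
  --model-id /data/my-gptq-model \
  --quantize gptq \
  --port 8080
```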
@Benvii The model you are linking was saved with a `transformer.` prefix on its weight names, which we don't support for now.
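One possible workaround (not an official TGI feature, just a sketch) is to rewrite the checkpoint's tensor names to drop the leading `transformer.` prefix so they match the names TGI expects. The example below works on a plain dict of name/tensor pairs; for a real sharded safetensors checkpoint you would apply the same renaming to each shard before re-saving:

```python
def strip_prefix(state_dict, prefix="transformer."):
    # Rename keys like "transformer.word_embeddings.weight" to
    # "word_embeddings.weight"; keys without the prefix pass through unchanged.
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }


sd = {
    "transformer.word_embeddings.weight": "tensor-0",  # placeholder values
    "lm_head.weight": "tensor-1",
}
print(sorted(strip_prefix(sd)))  # → ['lm_head.weight', 'word_embeddings.weight']
```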
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
System Info
cargo 1.72.0 (103a7ff2e 2023-08-15)
text-generation-launcher --env :
Information
Tasks
Reproduction
I'd like to use my GPTQ fine-tuned model with the example script referenced in the HF GPTQ Integration blog post. To run this script, I need the latest transformers, 0.4.33. However, after training, merging the adapter into the base model, and loading it into TGI, I get this error:
After a bit of googling, I found that it's probably a transformers version mismatch: https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1/discussions/3#64cc1b4ba257a3212c0e473b
I'm not sure that's the reason.
As said above, the steps to reproduce are:
merge.py
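The `merge.py` script itself isn't shown here. As a rough numeric sketch of what merging a LoRA adapter into a base model (e.g. PEFT's `merge_and_unload`) does: the low-rank update `B @ A`, scaled by `lora_alpha / r`, is added into the frozen base weight, after which the adapter can be discarded. Pure-Python 2x2 example; the matrix shapes and values are illustrative, and a real merge operates on torch tensors per layer:

```python
def matmul(X, Y):
    # Naive matrix multiply for small nested-list matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def merge_lora(W, A, B, lora_alpha, r):
    # Merged weight: W' = W + (lora_alpha / r) * (B @ A)
    scale = lora_alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, BA)]


W = [[1, 0], [0, 1]]       # base weight (2x2)
B = [[1], [1]]             # LoRA "up" matrix (2x1), rank r = 1
A = [[2, 3]]               # LoRA "down" matrix (1x2)
print(merge_lora(W, A, B, lora_alpha=1, r=1))  # → [[3.0, 3.0], [2.0, 4.0]]
```

After the merge, the resulting checkpoint is an ordinary dense model, which is why it should load in TGI without any adapter support.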
Expected behavior
The merged model loads correctly in TGI.