ndif-team / nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.
https://nnsight.net/
MIT License

AttributeError: 'LlamaForCausalLM' object has no attribute 'transformer' when using deepseek #154

Closed · g-w1 closed this issue 4 months ago

g-w1 commented 4 months ago

I'm not sure exactly what's going on, but I think it may be because I'm using DeepSeek?

With this code:

from nnsight import LanguageModel
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = LanguageModel("deepseek-ai/deepseek-llm-7b-base", device_map='cuda', tokenizer=tokenizer)

with model.trace("Hey here is some text"):
    print(model.transformer)
    output = model.output.save()

I get this error:

AttributeError                            Traceback (most recent call last)
Cell In[13], line 7
      3 #tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
      4 #model = LanguageModel("redwoodresearch/math_pwd_lock_deepseek_math7b_on_weak_pythia1b", device_map='cuda', tokenizer=tokenizer)
      5 model = LanguageModel("deepseek-ai/deepseek-llm-7b-base", device_map='cuda', tokenizer=tokenizer)
----> 7 with model.trace("Hey here is some text"):
      8     print(model.transformer)
      9     output = model.output.save()

File /.../python3.12/site-packages/nnsight/contexts/Runner.py:41, in Runner.__exit__(self, exc_type, exc_val, exc_tb)
     39 """On exit, run and generate using the model whether locally or on the server."""
     40 if isinstance(exc_val, BaseException):
---> 41     raise exc_val
     43 if self.remote:
     44     self.run_server()

Cell In[13], line 8
      5 model = LanguageModel("deepseek-ai/deepseek-llm-7b-base", device_map='cuda', tokenizer=tokenizer)
      7 with model.trace("Hey here is some text"):
----> 8     print(model.transformer)
      9     output = model.output.save()

File /.../python3.12/site-packages/nnsight/models/NNsightModel.py:315, in NNsight.__getattr__(self, key)
    309 def __getattr__(self, key: Any) -> Union[Envoy, InterventionProxy, Any]:
    310     """Wrapper of ._envoy's attributes to access module's inputs and outputs.
    311 
    312     Returns:
    313         Any: Attribute.
    314     """
--> 315     return getattr(self._envoy, key)

File /.../python3.12/site-packages/nnsight/envoy.py:373, in Envoy.__getattr__(self, key)
    363 def __getattr__(self, key: str) -> Union[Envoy, Any]:
    364     """Wrapper method for underlying module's attributes.
    365 
    366     Args:
   (...)
    370         Any: Attribute.
    371     """
--> 373     return getattr(self._module, key)

File /.../python3.12/site-packages/torch/nn/modules/module.py:1688, in Module.__getattr__(self, name)
   1686     if name in modules:
   1687         return modules[name]
-> 1688 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'LlamaForCausalLM' object has no attribute 'transformer'
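The chain above is plain attribute forwarding: NNsight defers unknown names to its Envoy, the Envoy defers to the wrapped torch module, and torch raises when the name isn't a registered submodule. A minimal illustrative sketch of that pattern (not the actual nnsight source; the class name is made up):

class EnvoySketch:
    # Stand-in for the wrapper layers visible in the traceback above.
    def __init__(self, module):
        self._module = module

    def __getattr__(self, key):
        # __getattr__ runs only when normal lookup fails, so any name
        # the wrapped torch module doesn't define ends up raising
        # AttributeError from torch, as seen above.
        return getattr(self._module, key)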
francescortu commented 4 months ago

If you run print(model) to see which modules the model exposes, you'll get:

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(102400, 4096)
    (layers): ModuleList(
      (0-29): 30 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=102400, bias=False)
  (generator): WrapperModule()
)

So the point is that transformer is not a valid module for this model: the Llama architecture nests its decoder stack under model instead.
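The access path mirrors that printed hierarchy exactly. A short sketch, using only module names visible in the printout above (for a GPT-2 style checkpoint the top-level module really is named transformer, which is why model.transformer works there but not here):

with model.trace("Hey here is some text"):
    # The decoder stack lives under .model, so index layers through it
    first_layer = model.model.layers[0].output.save()
    # The final RMSNorm is reachable the same way
    final_norm = model.model.norm.output.save()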

g-w1 commented 4 months ago

Thank you! I didn't realize the API wasn't homogeneous across all models. After some playing around, here is what I got to work (maybe a reference for future viewers of this issue):

from nnsight import LanguageModel
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = LanguageModel("deepseek-ai/deepseek-llm-7b-base", device_map='cuda', tokenizer=tokenizer)

with model.trace("Hey here is some text"):
    # Saves the output of the first decoder layer (the whole
    # LlamaDecoderLayer at index 0, not just its MLP).
    layer_0_output = model.model.layers[0].output.save()
print(layer_0_output)
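If the goal was the MLP output specifically (as the original variable name hinted), the same pattern reaches one level deeper into the printed hierarchy. A minimal sketch, assuming the same model and tokenizer setup as above:

with model.trace("Hey here is some text"):
    # Save just the MLP sub-block of decoder layer 0
    mlp_output = model.model.layers[0].mlp.output.save()
print(mlp_output)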