ndif-team / nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.
https://nnsight.net/
MIT License

Cannot specify custom tokenizer with UnifiedTransformer #202

Open Butanium opened 3 weeks ago

Butanium commented 3 weeks ago
Passing a custom tokenizer to UnifiedTransformer raises a TypeError. Minimal reproduction:
!pip install -q git+https://github.com/ndif-team/nnsight@0.3 transformer_lens
from nnsight.models.UnifiedTransformer import UnifiedTransformer
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = UnifiedTransformer("gpt2", tokenizer=tokenizer, processing=False)

---> 43         super().__init__(hooked_model, tokenizer=self.tokenizer, *args, **kwargs)
     44 
     45     def _prepare_inputs(

TypeError: nnsight.models.LanguageModel.LanguageModel.__init__() got multiple values for keyword argument 'tokenizer'

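Proposed fix (PR incoming): UnifiedTransformer always passes tokenizer=self.tokenizer to the parent __init__, so a tokenizer supplied by the caller arrives a second time through kwargs. The constructor should check whether tokenizer is already in kwargs before passing it explicitly.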
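
For illustration, here is a minimal, self-contained sketch of the failure and of one way to apply that check. It does not use nnsight: BaseModel, BuggyWrapper, FixedWrapper and the placeholder tokenizer strings are hypothetical stand-ins for the pattern shown in the traceback, and the actual PR may resolve it differently.

```python
class BaseModel:
    """Hypothetical stand-in for the parent class; takes an optional tokenizer."""

    def __init__(self, model, *args, tokenizer=None, **kwargs):
        self.model = model
        self.tokenizer = tokenizer


class BuggyWrapper(BaseModel):
    """Mirrors the failing pattern: tokenizer is passed explicitly AND via kwargs."""

    def __init__(self, model, *args, **kwargs):
        default_tokenizer = f"default-tokenizer-for-{model}"
        # If the caller also passed tokenizer=..., it is still inside kwargs,
        # so Python sees the keyword twice and raises TypeError.
        super().__init__(model, *args, tokenizer=default_tokenizer, **kwargs)


class FixedWrapper(BaseModel):
    """Assumed fix: only fall back to the default when no tokenizer was given."""

    def __init__(self, model, *args, **kwargs):
        default_tokenizer = f"default-tokenizer-for-{model}"
        # setdefault leaves a caller-supplied tokenizer untouched.
        kwargs.setdefault("tokenizer", default_tokenizer)
        super().__init__(model, *args, **kwargs)


if __name__ == "__main__":
    try:
        BuggyWrapper("gpt2", tokenizer="custom-tokenizer")
    except TypeError as err:
        print("buggy:", err)

    print("fixed:", FixedWrapper("gpt2", tokenizer="custom-tokenizer").tokenizer)
    print("fixed (no tokenizer given):", FixedWrapper("gpt2").tokenizer)
```

With setdefault, an explicit tokenizer=None from the caller is also kept as-is; if None should fall back to the default, the check would need to test the value rather than the key.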