Open Khazmadoz opened 1 year ago
Hi @Khazmadoz, is this a custom tokenizer? It seems odd that it would have a pad_token but not a pad_token_id, which is just the token's numerical form. Would you be able to run the code below and paste your output?
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_YOUR_TOKENIZER)
print(tokenizer.all_special_tokens)
print(tokenizer.all_special_ids)
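If pad_token_id comes back as None, one common workaround is to derive the id from the token yourself. A minimal sketch, assuming the tokenizer exposes the standard pad_token attribute and convert_tokens_to_ids method (both present on transformers tokenizers); the StubTokenizer below is a made-up stand-in for a tokenizer in the broken state described:

```python
def resolve_pad_id(tokenizer):
    """Return tokenizer.pad_token_id, deriving it from pad_token if needed."""
    pad_id = getattr(tokenizer, "pad_token_id", None)
    if pad_id is None and getattr(tokenizer, "pad_token", None) is not None:
        # convert_tokens_to_ids is standard on transformers tokenizers
        pad_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
    return pad_id

# Stub standing in for a tokenizer that has pad_token but no pad_token_id:
class StubTokenizer:
    pad_token = "[PAD]"

    def convert_tokens_to_ids(self, token):
        return 0  # id of "[PAD]" in this stub's vocabulary

print(resolve_pad_id(StubTokenizer()))  # 0
```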
Hey @cdpierse,
the script I used came from a colleague, who in turn got it from someone else, and I came to the conclusion that it was simply not well written. I adapted another Twitter-analysis example that uses a multiclass model with an AutoTokenizer, and now it works nicely 😄 Thank you for your help anyway. However, I wondered whether there is a way to display all the words in a dataset that the fine-tuned model used for a decision in favor of a chosen class (I think LIME does something like that with a tree). I don't know if you have already implemented something like this; I will dive into the documentation to see if there is something. Otherwise, it would be a very nice feature to add. 😊
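On the feature question: the explainer produces per-word attribution scores for a single text, so one way to approximate a dataset-wide view is to aggregate those scores yourself. A minimal sketch; the sample pairs below are made up, and in practice each inner list would come from running the explainer on one text:

```python
from collections import defaultdict

def top_words(attributions_per_text, k=3):
    """Rank words by total attribution score across many texts."""
    totals = defaultdict(float)
    for pairs in attributions_per_text:  # one list of (word, score) per text
        for word, score in pairs:
            totals[word] += score
    return sorted(totals, key=totals.get, reverse=True)[:k]

# Hypothetical attribution output for two texts:
sample = [
    [("great", 0.8), ("movie", 0.1)],
    [("great", 0.6), ("plot", 0.3)],
]
print(top_words(sample))  # ['great', 'plot', 'movie']
```

Filtering to the texts the model assigned to the chosen class before aggregating would give the per-class view described above.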
Hey there,
I'm trying to use your transformers-interpret package with a bert-base-uncased model and the corresponding tokenizer. I'm loading the tokenizer using:
tokenizer = BertTokenizer.from_pretrained(BERT_USED_MODEL_PATH, do_lower_case=True)
When I use the interpreter, using:
from transformers_interpret import SequenceClassificationExplainer
cls_explainer = SequenceClassificationExplainer(model=model, tokenizer=tokenizer)
I get an error:
Traceback (most recent call last):
  File "/home/khazmadoz/ML_GG/main.py", line 232, in <module>
    cls_explainer = SequenceClassificationExplainer(model=model,
  File "/home/khazmadoz/anaconda3/lib/python3.9/site-packages/transformers_interpret/explainers/text/sequence_classification.py", line 53, in __init__
    super().__init__(model, tokenizer)
  File "/home/khazmadoz/anaconda3/lib/python3.9/site-packages/transformers_interpret/explainer.py", line 22, in __init__
    self.ref_token_id = self.tokenizer.pad_token_id
AttributeError: 'BertTokenizer' object has no attribute 'pad_token_id'
Do you know how to fix this? The tokenizer has an attribute 'pad_token' but no 'pad_token_id'.
Thank you, David
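For context on why a tokenizer can have a pad_token but no usable pad_token_id: the id is just the vocabulary lookup of the token, so a pad token that was set without being registered in the vocabulary has no id to resolve to. A toy illustration of that lookup, not the real transformers internals:

```python
# Minimal stand-in vocabulary; real BERT vocabs map "[PAD]" to 0.
vocab = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102}

def token_to_id(token):
    # Mirrors the fallback behaviour of convert_tokens_to_ids:
    # tokens missing from the vocabulary resolve to the [UNK] id.
    return vocab.get(token, vocab["[UNK]"])

print(token_to_id("[PAD]"))  # 0
print(token_to_id("<pad>"))  # 100 -- a pad token that never made it into the vocab
```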