nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book
https://transformersbook.com/
Apache License 2.0

Chapter 3: 'BertConfig' object has no attribute 'hidden_state' #90

Open hoangminhtoan opened 1 year ago

hoangminhtoan commented 1 year ago

Information

The question or comment is about chapter: Chapter 3

Question or comment

I got this error when running the Chapter 03 notebook on Google Colab with transformers version 4.13.0: `AttributeError: 'BertConfig' object has no attribute 'hidden_state'`.

(Screenshots of the error traceback attached.)

AlessandroMiola commented 1 year ago

Are you sure you're not mixing `hidden_size` and `hidden_state` up? `hidden_state` is just the name given to the argument passed to the `.forward` method of `MultiHeadAttention`; it is not used as an attribute of the config object, as far as I can see.
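
For reference, this is roughly the pattern the Chapter 3 notebook uses (a paraphrased sketch, not the exact notebook code): `config.hidden_size` is only read in `__init__`, while `hidden_state` only appears as the name of the tensor passed to `forward()`, so it never needs to exist on the config.

```python
# Paraphrased sketch of the Chapter 3 attention classes (not the exact notebook code).
import torch
from torch import nn
from torch.nn import functional as F
from math import sqrt

def scaled_dot_product_attention(query, key, value):
    dim_k = query.size(-1)
    scores = torch.bmm(query, key.transpose(1, 2)) / sqrt(dim_k)
    weights = F.softmax(scores, dim=-1)
    return torch.bmm(weights, value)

class AttentionHead(nn.Module):
    def __init__(self, embed_dim, head_dim):
        super().__init__()
        self.q = nn.Linear(embed_dim, head_dim)
        self.k = nn.Linear(embed_dim, head_dim)
        self.v = nn.Linear(embed_dim, head_dim)

    def forward(self, hidden_state):
        # hidden_state is just the argument name here; the config is never touched
        return scaled_dot_product_attention(
            self.q(hidden_state), self.k(hidden_state), self.v(hidden_state))

class MultiHeadAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        embed_dim = config.hidden_size          # attribute that BertConfig does have
        num_heads = config.num_attention_heads
        head_dim = embed_dim // num_heads
        self.heads = nn.ModuleList(
            [AttentionHead(embed_dim, head_dim) for _ in range(num_heads)])
        self.output_linear = nn.Linear(embed_dim, embed_dim)

    def forward(self, hidden_state):
        x = torch.cat([head(hidden_state) for head in self.heads], dim=-1)
        return self.output_linear(x)
```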

hoangminhtoan commented 1 year ago

(Screenshot attached.)

> Are you sure you're not mixing `hidden_size` and `hidden_state` up? `hidden_state` is just the name given to the argument passed to the `.forward` method of `MultiHeadAttention`; it is not used as an attribute of the config object, as far as I can see.

`hidden_state` is used in `.forward(self, hidden_state)` in the `AttentionHead` class. I cloned the notebook and then ran it. I'll change the attribute `hidden_state` into `hidden_size` to check whether the error still occurs.
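
One quick way to check which of the two attributes the config actually exposes (a small sketch, assuming the notebook's `bert-base-uncased` checkpoint) is to inspect it directly:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bert-base-uncased")
print(config.hidden_size)               # 768 -- this attribute exists on BertConfig
print(hasattr(config, "hidden_state"))  # False -- config.hidden_state would raise the AttributeError above
```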

AlessandroMiola commented 1 year ago

I didn't mean you should change `hidden_state` into `hidden_size`; I meant that, as is (and as far as I could experiment), `hidden_state` is not used as an attribute of the `BertConfig` object (`BertConfig` has no such attribute), but rather as an argument to the `.forward` methods of both the `AttentionHead` and `MultiHeadAttention` classes, which shouldn't hurt.

This said, I didn't clone and run the notebook directly, therefore I might be wrong.
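
Reusing the `MultiHeadAttention` sketch from above, a quick smoke test (names and shapes assumed, not taken from the notebook) should confirm that passing the embeddings under the argument name `hidden_state` is harmless:

```python
import torch
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bert-base-uncased")
multihead_attn = MultiHeadAttention(config)             # class from the sketch above
inputs_embeds = torch.randn(1, 5, config.hidden_size)   # (batch, seq_len, hidden_size)
attn_output = multihead_attn(inputs_embeds)
print(attn_output.shape)  # expected: torch.Size([1, 5, 768])
```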