yhznb closed this issue 3 years ago.
Hey @yhznb,
We try to mainly use the github issues for bugs in the library. For more customized questions it would be great if you could use https://discuss.huggingface.co/ instead.
Regarding your question, I would just add a layer to `BertLMHeadModel` wherever you want to, and then build your `EncoderDecoderModel` from `BertModel` (encoder) and your use-case-specific `BertLMHeadModel` (decoder).
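A minimal sketch of what this could look like (this is an assumption about one way to wire it up, not an official recipe; the tiny random-weight configs are only there to keep the sketch self-contained, and `extra_proj` is a purely illustrative layer):

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel, BertLMHeadModel, EncoderDecoderModel

class MyBertLMHeadModel(BertLMHeadModel):
    """BertLMHeadModel with an extra (illustrative) layer to use in your own forward logic."""
    def __init__(self, config):
        super().__init__(config)
        # hypothetical extra layer; replace with whatever your use case needs
        self.extra_proj = nn.Linear(config.hidden_size, config.hidden_size)

# tiny configs so the sketch runs without downloads; in practice you would load
# pretrained weights, e.g. BertModel.from_pretrained("bert-base-uncased") and
# MyBertLMHeadModel.from_pretrained("bert-base-uncased", is_decoder=True, add_cross_attention=True)
cfg = dict(hidden_size=32, num_hidden_layers=2, num_attention_heads=2,
           intermediate_size=64, vocab_size=100)
encoder = BertModel(BertConfig(**cfg))
decoder = MyBertLMHeadModel(BertConfig(**cfg, is_decoder=True, add_cross_attention=True))
model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

out = model(input_ids=torch.ones(1, 4, dtype=torch.long),
            decoder_input_ids=torch.ones(1, 3, dtype=torch.long))
```

The `EncoderDecoderModel` constructor accepts any compatible `encoder`/`decoder` pair, so the custom subclass slots in directly.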
Hey @patrickvonplaten, I have the same question. Can you provide an example of building the `EncoderDecoderModel` from `BertModel` (encoder) and a use-case-specific `BertLMHeadModel` (decoder)? I can't find this in the official documentation. Thank you very much.
I think the `EncoderDecoderModel` outputs all the hidden states at once, and I want to control it step by step. For example, I want to change the LM head of the decoder by concatenating another vector. The problem is that the decoder outputs all the hidden states at once, but I want step-by-step decoding: use the concatenated vector as the hidden state for generation, and feed the generated word vector back as the next step's input. How can I change the model or call the interface properly? Is this possible within the Hugging Face framework? Thank you very much! @patrickvonplaten
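One way to get this kind of step-by-step control is to bypass `generate()` and write the greedy loop by hand, modifying the decoder's last hidden state before the LM head at each step. A sketch under assumptions (the additive `extra_vec` mixing is illustrative; a true concatenation would need a projection back to `hidden_size` or a new head, and this uncached loop re-runs the decoder each step, so it is slow):

```python
import torch
from transformers import BertConfig, BertModel, BertLMHeadModel, EncoderDecoderModel

# tiny random-weight bert2bert model so the sketch is self-contained
cfg = dict(hidden_size=32, num_hidden_layers=2, num_attention_heads=2,
           intermediate_size=64, vocab_size=100)
model = EncoderDecoderModel(
    encoder=BertModel(BertConfig(**cfg)),
    decoder=BertLMHeadModel(BertConfig(**cfg, is_decoder=True, add_cross_attention=True)),
)

@torch.no_grad()
def greedy_decode(model, input_ids, extra_vec, start_token_id=1, max_len=5):
    # run the encoder once
    enc_hidden = model.encoder(input_ids=input_ids).last_hidden_state
    generated = torch.full((input_ids.size(0), 1), start_token_id, dtype=torch.long)
    for _ in range(max_len):
        # re-run the decoder body on everything generated so far (no cache)
        dec_out = model.decoder.bert(input_ids=generated,
                                     encoder_hidden_states=enc_hidden)
        last_hidden = dec_out.last_hidden_state[:, -1, :]   # (batch, hidden)
        mixed = last_hidden + extra_vec                     # <-- your custom change here
        logits = model.decoder.cls(mixed.unsqueeze(1))      # reuse the existing LM head
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
    return generated

out = greedy_decode(model, torch.ones(2, 4, dtype=torch.long),
                    extra_vec=torch.zeros(32))
```

This relies on `BertLMHeadModel` exposing its body as `.bert` and its LM head as `.cls`, which is the case in current transformers versions.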
I also raised this in the forum. Should this issue be closed? The link is here: https://discuss.huggingface.co/t/control-encoderdecodermodel-to-generate-tokens-step-by-step/1756
Thank you very much! @patrickvonplaten
Have you solved your question? @AI678 I think it is all about changing the LM head and the calculation of the logits, but I don't know how to change it.
Yes, you are right. @yhznb
Sorry, I misunderstood what you meant. This seems to be a feature that still needs to be developed. How long would it take to develop? Thank you for your response.
Hey, I have similar needs. I think using only vanilla bert2bert or roberta2roberta is not sufficient for abstractive summarization. For fluency and information richness, we should consider changing the top layer of the decoder for further learning.
Hey @patrickvonplaten, when do you plan to release that?
@nlpLover123, you can control it step by step, but I think that is too slow for a large dataset like CNN/DailyMail. I also want to ask when you plan to release this. @patrickvonplaten If it will take too long, maybe I will write an encoder-decoder model from scratch, because I have little time to wait for it. Thank you very much.
That is too difficult @AI678. Maybe it is even slower than step-by-step generation.
So I just want to make a specific change at the LM head layer @moonlightarc
@AI678, I don't think we are planning on releasing such a feature into the library. It's a very specific request, and I'd suggest that you fork the repo and make the changes according to your needs.
❓ Questions & Help
Details
Hey, I use `EncoderDecoderModel` for abstractive summarization. I load the bert2bert model like this:

```python
model = EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased')
```
And I want to make some structural changes to the output layer of decoder model.
For example, in one decoding step, the output hidden state of the BERT decoder is a vector (s). I use another network to get a vector (w) that makes the summarization more accurate. I want to concatenate the two vectors in the output layer and use the final vector to generate a word from the vocabulary.
How can I do this ?
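One possible sketch of such an output layer (an assumption, not library API: `ConcatLMHead` and all sizes here are hypothetical names for illustration). It concatenates the decoder hidden state s with the auxiliary vector w and projects the result to vocabulary logits:

```python
import torch
import torch.nn as nn

class ConcatLMHead(nn.Module):
    """Hypothetical LM head: project [s; w] to vocabulary logits."""
    def __init__(self, hidden_size, aux_size, vocab_size):
        super().__init__()
        # the concatenated vector has hidden_size + aux_size dimensions
        self.proj = nn.Linear(hidden_size + aux_size, vocab_size)

    def forward(self, s, w):
        # s: (batch, seq_len, hidden_size); w: (batch, seq_len, aux_size)
        return self.proj(torch.cat([s, w], dim=-1))

head = ConcatLMHead(hidden_size=768, aux_size=128, vocab_size=30522)
logits = head(torch.randn(2, 5, 768), torch.randn(2, 5, 128))  # (2, 5, 30522)
```

Such a head could replace `cls` in a `BertLMHeadModel` subclass whose `forward` also accepts w, along the lines of Patrick's suggestion above.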