Open aohenuo opened 7 months ago
@aohenuo Hello, I ran into a similar question and have been looking for a solution. From reading the code, the issue seems to relate to the hidden states produced by each layer. You could try setting "output_hidden_states" to "True" in the generation configuration, which should make the output include those hidden states. I'm not certain about this, though. Could you share what you have found?
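In case it helps, here is a minimal sketch of what that setting does. It assumes the model goes through the Hugging Face `transformers` API, which this thread does not confirm. The tiny randomly initialised `LlamaForCausalLM` below is only a stand-in so the snippet runs without downloading weights; it is not the model from this repository, and all sizes are made up.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

torch.manual_seed(0)

# Stand-in model (assumption): tiny random LLaMA-style network, NOT this repo's model.
config = LlamaConfig(
    vocab_size=100,
    hidden_size=16,
    intermediate_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    num_key_value_heads=2,
    max_position_embeddings=64,
)
model = LlamaForCausalLM(config).eval()

input_ids = torch.tensor([[5, 6, 7, 8]])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    # Plain forward pass: hidden_states holds the embedding output
    # plus one tensor per decoder layer.
    forward_out = model(
        input_ids,
        attention_mask=attention_mask,
        output_hidden_states=True,
    )

    # Generation: return_dict_in_generate=True is needed so that the
    # hidden states are returned alongside the generated token ids.
    gen_out = model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_new_tokens=2,
        do_sample=False,
        pad_token_id=0,
        return_dict_in_generate=True,
        output_hidden_states=True,
    )

print(len(forward_out.hidden_states))      # embeddings + one entry per layer
print(forward_out.hidden_states[-1].shape)  # (batch, sequence, hidden_size)
print(len(gen_out.hidden_states[0]))        # per-layer states for the first generation step
```

With `generate`, `hidden_states` is a tuple with one entry per generated token, and each entry is itself a tuple with one tensor per layer. Whether these per-layer states are what the paper calls the feature vector is something only the authors can confirm.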
Thank you very much for your work! I have a question about how the paper maps onto the code. Where exactly is the feature vector inserted into the language model? Is it in the MLP layer? If so, at which point: at the gate linear, at the up linear, or at the down linear, once its output has been computed?
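To make the three candidate positions concrete, here is a rough sketch of a LLaMA-style gated MLP written in PyTorch. Everything in it is an assumption for illustration: the class name `GatedMLP`, the `feature` and `where` arguments, and in particular the idea that "inserting" means adding the vector to an activation. It is not taken from the paper or the repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedMLP(nn.Module):
    """Hypothetical gated MLP with three possible injection points."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x, feature=None, where=None):
        gate = self.gate_proj(x)
        up = self.up_proj(x)
        if where == "gate":
            # Candidate 1: add to the gate linear output (intermediate_size).
            gate = gate + feature
        elif where == "up":
            # Candidate 2: add to the up linear output (intermediate_size).
            up = up + feature
        out = self.down_proj(F.silu(gate) * up)
        if where == "down":
            # Candidate 3: add after the down linear output (hidden_size).
            out = out + feature
        return out


torch.manual_seed(0)
mlp = GatedMLP(hidden_size=8, intermediate_size=16)
x = torch.randn(2, 3, 8)

baseline = mlp(x)
after_down = mlp(x, feature=torch.ones(8), where="down")
at_gate = mlp(x, feature=torch.ones(16), where="gate")
at_up = mlp(x, feature=torch.ones(16), where="up")
```

Note that the vector would need a different size depending on the position: the intermediate size at the gate or up linear, and the hidden size after the down linear. Knowing the dimension of the feature vector in the paper might already narrow down the answer.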