Hi Daniel,
Bunny calls the forward of Phi here, so I think you can use a variable to save the last hidden state here and return it by adding an extra item to CausalLMOutputWithPast.

And I also think output_ids['hidden_states'][-1] is what you need. But note that this last_hidden_states has already been layer-normalized here.
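Something like this, for example (a minimal sketch assuming an HF-style Phi module layout; final_layernorm is the name in transformers' modeling_phi.py and may differ in Bunny's customized model):

```python
import torch

captured = {}

def save_pre_norm(module, inputs, output):
    # The input to the final LayerNorm is the hidden state *before* normalization.
    captured["pre_norm"] = inputs[0].detach()

# `model` / `input_ids` are whatever you already use for inference.
hook = model.model.final_layernorm.register_forward_hook(save_pre_norm)
with torch.no_grad():
    out = model(input_ids, output_hidden_states=True, return_dict=True)
hook.remove()

post_norm = out.hidden_states[-1]  # what lm_head consumes (already layer-normalized)
pre_norm = captured["pre_norm"]    # raw output of the last decoder layer
```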
Hi, @Isaachhh
I appreciate your suggestions.

Actually, I printed the shapes of all the items in output_ids['hidden_states'], since it is a tuple. During causal-decoder inference it ends up with 1024 items: the first item, output_ids['hidden_states'][0], has shape (bs, token_nums, 2048), and the remaining items all have shape (bs, 1, 2048). I assume this is because the causal decoder-only framework predicts token by token; is that right?

Also, I have no idea why the length of output_ids['hidden_states'] is 1024.
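For reference, this is how I am reading those shapes (a rough sketch; I assume each tuple item is already the last-layer tensor and that there is one item per decoding step):

```python
import torch

hs = output_ids['hidden_states']
# hs[0]:            (bs, token_nums, 2048)  -- the whole prompt in one forward pass
# hs[1] ... hs[-1]: (bs, 1, 2048)           -- one newly generated token per step
full_sequence = torch.cat(hs, dim=1)        # (bs, token_nums + 1023, 2048)
last_token_state = hs[-1][:, -1, :]         # state of the final generated token
```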
Best, Daniel
I see. It was a nice discussion with you. Thank you again.
Hi, Bunny team,
Thanks for providing such a nice project.
I am trying to extract the last hidden state of bunny-phi, so I added an extra parameter here, like this:
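(A sketch of the call; the exact argument names are assumed from the standard transformers generate API, and the images argument is my guess for Bunny's multimodal input.)

```python
output_ids = model.generate(
    input_ids,
    images=image_tensor,            # assumed: Bunny is multimodal
    max_new_tokens=1024,
    output_hidden_states=True,      # the extra parameter
    return_dict_in_generate=True,   # so the hidden states come back in the output
)
```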
Next, I received a bunch of tensors: output_ids['hidden_states'] is a tuple, and its length is 1024. I am not sure how to extract the last hidden state from Phi, i.e., the tensor just before lm_head.

Thanks, Daniel