Open sadra-barikbin opened 7 months ago
Hey,

> For the decoder-only and `pre_encode_input=True` case, the `tokenized_contexts=batch_inputs` argument to custom module calls lacks the last token of the context, which has been prepended to the output. Is this really desired? Couldn't it be confusing for the user?
Yes, when using `pre_encode_input=True` with decoder-only models, the input is first given to the LLM and the `past_key_values` are obtained. However, transformers only returns the `hidden_states` for the inputs of the current call, not for the `past_key_values`. So to get the hidden states of the input's last token, this token must be removed from the inputs used when pre-encoding.

I agree that this may be confusing. This is among a long list of things that aren't documented. I unfortunately have very limited bandwidth to work on improving the documentation :/
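To make the token bookkeeping concrete, here is a minimal sketch (illustrative token ids only, no actual model call — not lamorel's code) of why the last context token ends up in the decoded part rather than in the pre-encoded one:

```python
context = [10, 11, 12, 13]  # tokenized context (illustrative ids)
output = [20, 21]           # tokenized output

# The context minus its last token is fed once to build past_key_values;
# transformers returns hidden_states only for the tokens of the current
# forward call, so anything cached here has no retrievable hidden state later.
pre_encoded = context[:-1]

# The last context token is therefore prepended to the output and run in
# the second call, whose hidden_states then cover it.
decoded = [context[-1]] + output

print(pre_encoded, decoded)  # [10, 11, 12] [13, 20, 21]
```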
> Why don't we use `bos_token` here while creating decoder input in the encoder-decoder setup?

The `bos_token` is not provided by all tokenizers. When I first implemented `hf_llm.py`, I was mostly using T5 models, which do not have any `bos_token`. That said, I had to force the pad token to 0, as a `pad_token` is also not implemented for all models... I don't know if there is any clean universal solution. Nevertheless, it seems the token 0 is often used for padding.
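As an illustration, forcing the token 0 amounts to a fallback roughly like the following (a sketch with hypothetical helper and tokenizer names, not the actual hf_llm.py code):

```python
# Hypothetical fallback: pick a decoder-input start id even for
# tokenizers, like T5's, that define no bos_token.
def start_token_id(tokenizer):
    for attr in ("bos_token_id", "pad_token_id"):
        token_id = getattr(tokenizer, attr, None)
        if token_id is not None:
            return token_id
    return 0  # last resort: 0 is often the pad token (e.g. T5)

class T5LikeTokenizer:  # stand-in: no bos_token, pad token forced to 0
    bos_token_id = None
    pad_token_id = 0

print(start_token_id(T5LikeTokenizer()))  # 0
```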
> - [ ] To batchify generation:

Yes, this needs to be done :)
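For reference, batching generation for a decoder-only model mainly means chunking the prompts and left-padding each chunk so that the last position of every row is a real token. A minimal sketch with assumed helper names (not lamorel code):

```python
# Hypothetical helpers for batched generation (not lamorel's actual code).
def chunk(items, batch_size):
    # Split the list of tokenized prompts into generation batches.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def left_pad(batch_ids, pad_id):
    # Decoder-only generation needs LEFT padding: the model continues from
    # the last position, which must hold a real token, not padding.
    width = max(len(ids) for ids in batch_ids)
    return [[pad_id] * (width - len(ids)) + ids for ids in batch_ids]

prompts = [[1, 2, 3], [4], [5, 6]]
for batch in chunk(prompts, 2):
    print(left_pad(batch, 0))
# [[1, 2, 3], [0, 0, 4]]
# [[5, 6]]
```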
> - [ ] We could add `output_hidden_states` not as a tensor:

Right. We would have to check that the transformers API handles a boolean.
> - [ ] When `pretrained=False` and `use_cpu=False`, `HF_LLM.__init__` raises an error in:

I need to test this. I am not sure when I can do it, though.
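The workaround suggested in the original issue (route the ad-hoc config parameters through the config class, since `from_config` does not accept them but `from_pretrained` does) could be sketched as follows. The classes here are stand-ins for the transformers ones; only the kwargs routing is the point:

```python
# Stand-ins for transformers' config/model classes (assumption: real
# classes behave like this for the purpose of kwargs routing).
class FakeConfig:
    def __init__(self, **kw):
        self.kw = kw
    @classmethod
    def from_pretrained(cls, path, **kw):
        return cls(path=path, **kw)

class FakeModel:
    def __init__(self, config):
        self.config = config
    @classmethod
    def from_pretrained(cls, path, **kw):
        return cls(FakeConfig(path=path, **kw))
    @classmethod
    def from_config(cls, config):
        return cls(config)

def build_model(model_cls, config_cls, path, pretrained, **adhoc_config):
    if pretrained:
        # from_pretrained accepts ad-hoc config overrides directly
        return model_cls.from_pretrained(path, **adhoc_config)
    # from_config does not, so apply the overrides to the config first
    config = config_cls.from_pretrained(path, **adhoc_config)
    return model_cls.from_config(config)

model = build_model(FakeModel, FakeConfig, "t5-small",
                    pretrained=False, dropout_rate=0.0)
print(model.config.kw)  # {'path': 't5-small', 'dropout_rate': 0.0}
```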
Hi there! I had a few questions & suggestions regarding `hf_llm.py`. Sorry in advance for being naive.

- [ ] For the decoder-only and `pre_encode_input=True` case, the `tokenized_contexts=batch_inputs` argument to custom module calls lacks the last token of the context, which has been prepended to the output. Is this really desired? Couldn't it be confusing for the user? https://github.com/flowersteam/lamorel/blob/c82d1b17d275ef40f5d5c23dafde0e56ef01e73e/lamorel/src/lamorel/server/llms/hf_llm.py#L346-L352
- [ ] Why don't we use `bos_token` here while creating decoder input in the encoder-decoder setup? https://github.com/flowersteam/lamorel/blob/c82d1b17d275ef40f5d5c23dafde0e56ef01e73e/lamorel/src/lamorel/server/llms/hf_llm.py#L167
- [ ] To batchify generation:
- [ ] We could add `output_hidden_states` not as a tensor: https://github.com/flowersteam/lamorel/blob/c82d1b17d275ef40f5d5c23dafde0e56ef01e73e/lamorel/src/lamorel/server/llms/hf_llm.py#L327-L337
- [ ] When `pretrained=False` and `use_cpu=False`, `HF_LLM.__init__` raises an error in: https://github.com/flowersteam/lamorel/blob/c82d1b17d275ef40f5d5c23dafde0e56ef01e73e/lamorel/src/lamorel/server/llms/hf_llm.py#L61 since it seems that we could not give ad-hoc model config parameters to `model_cls.from_config`, but only to `model_cls.from_pretrained`. Or we could instead have `config = config_cls.from_pretrained(path, **adhoc); model_cls.from_config(config)`. To reproduce: