[Open] Bachstelze opened this issue 9 months ago
Hi @Bachstelze, thanks for raising an issue!
The EncoderDecoder models are composite models which use AutoModel to load the encoder and decoder respectively. As per the BertGeneration docs, you can load the model using:
```python
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("Bachstelze/instructionRoberta-base", output_attentions=True)
```
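As a quick sanity check (a sketch, assuming the repo's tokenizer also loads with AutoTokenizer), the model loaded this way can then be run with attentions enabled:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Bachstelze/instructionRoberta-base")
inputs = tokenizer("Write a short instruction.", return_tensors="pt")

# feed the same ids as decoder inputs just to exercise the forward pass
outputs = model(**inputs, decoder_input_ids=inputs["input_ids"])
print(len(outputs.encoder_attentions))  # one attention tensor per encoder layer
```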
@amyeroberts Yes, it is possible to load it as an EncoderDecoderModel, but many libraries load generic models simply with AutoModel, so EncoderDecoderModel checkpoints raise an error.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Given that EncoderDecoderModel already uses AutoModel internally, it should be possible to load it as an AutoModel as well. Or isn't that possible? @amyeroberts
Hi @Bachstelze, it doesn't make sense to load it this way. AutoModel is used to load individual models defined in MODEL_MAPPING_NAMES in modeling_auto.py. EncoderDecoder is a composite model that lets you combine encoder and decoder models, but it isn't a defined model within our library the way e.g. BERT is.
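As a rough illustration (not the exact internal code path), the mapping can be inspected directly:

```python
from transformers import AutoConfig
from transformers.models.auto.modeling_auto import MODEL_MAPPING_NAMES

# AutoModel resolves the concrete class from the config's model_type
config = AutoConfig.from_pretrained("bert-base-uncased")
print(config.model_type)                       # "bert"
print(MODEL_MAPPING_NAMES[config.model_type])  # "BertModel"

# the composite encoder-decoder model has no entry of its own here,
# which is why AutoModel cannot instantiate it
print("encoder-decoder" in MODEL_MAPPING_NAMES)  # False
```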
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@amyeroberts Isn't it possible to add the EncoderDecoder into modeling_auto.py? Otherwise, all the other libraries that use Hugging Face would have to be rewritten to support this model type.
@Bachstelze Adding the model to modeling_auto.py would mean adding an AutoEncoderDecoder class, which would be equivalent in terms of code to using the EncoderDecoderModel class.
Perhaps you can elaborate a bit more on why this is needed? One thing that might be useful to know is that the architecture needed to load the model can be found in the model's config.
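For example (assuming the checkpoint ships a standard EncoderDecoderConfig), the config alone tells you which class to instantiate:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Bachstelze/instructionRoberta-base")
print(config.model_type)     # "encoder-decoder"
print(config.architectures)  # e.g. ["EncoderDecoderModel"], if set on the checkpoint
```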
@amyeroberts thanks for the reply!
Many external libraries just use AutoModel to initialise the model class. They would all need to be extended to support EncoderDecoderModel, or we allow loading as AutoModel once in the Hugging Face library.
@amyeroberts One concrete example is the 🤗 Open LLM Leaderboard. It requires the model to be loadable with AutoClasses: "Make sure you can load your model and tokenizer using AutoClasses." How can this be achieved, or should the leaderboard be extended?
@Bachstelze OK, I see. Thanks for providing an example.
So, it doesn't make sense to load a custom encoder-decoder with the AutoModel API, as the default model created is a causal LM. This is actually a good thing regarding the leaderboard, as it uses AutoModelForCausalLM to load the models (rather than AutoModel).
However, we still shouldn't put the encoder-decoder model into the auto mapping because it's ultimately too flexible: it's possible to pass in a decoder with any task head that you want. Models which can be loaded in an auto class, e.g. AutoModelForCausalLM, should all perform the same task, take the same inputs and return the same outputs.
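To illustrate the flexibility, here is a minimal sketch using the documented from_encoder_decoder_pretrained helper, which can pair arbitrary checkpoints:

```python
from transformers import EncoderDecoderModel

# Any encoder can be combined with any decoder; the cross-attention layers are
# newly initialised, so there is no single task signature an auto class could guarantee.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
```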
If you'd like to have your model evaluated according to the Open LLM leaderboard, you can use the lighteval library to get the same results.
@amyeroberts This seems theoretically possible, yet the lighteval library is still unstable and buggy: https://github.com/huggingface/lighteval/issues/183
If you need to evaluate a model following the same steps as the leaderboard, you can actually use the lm_eval harness from Eleuther, following the steps in the Reproducibility section of the About tab. lighteval is still in alpha, so we don't guarantee exhaustiveness or stability for now, but we're doing our best to fix issues as they arise.
@clefourrier which "steps in the Reproducibility section of the About tab" do you mean? I couldn't run lm-evaluation-harness with the description in the README.
How can an encoder-decoder model be loaded as AutoModelForCausalLM or AutoModelForSeq2SeqLM?
> If you'd like to have your model evaluated according to the Open LLM leaderboard, ...
If you go to the Open LLM Leaderboard page, there is a tab called "About" which gives all the steps needed to reproduce the results (for AutoModelForCausalLM models only, however).
System Info

transformers version: 4.35.0

Who can help?

@ArthurZucker and @younesbelkada

Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Bachstelze/instructionRoberta-base")
model = AutoModel.from_pretrained("Bachstelze/instructionRoberta-base", output_attentions=True)
```
Expected behavior
Load the EncoderDecoderModel as AutoModel. "BertGenerationConfig" is supported, though this seems outdated.