Open Bachstelze opened 3 weeks ago
Hi @Bachstelze, thanks for raising an issue!
This question is best placed in our forums; we try to reserve the GitHub issues for feature requests and bug reports.
Though it isn't possible to evaluate them properly: https://github.com/huggingface/transformers/issues/28721
This isn't quite right: it's not possible to load them through the AutoModelForCausalLM API and hence submit them to the Open LLM Leaderboard, but they can still be evaluated manually. If the decoder is loaded with AutoModelForCausalLM, which is done by default, you already have the task-specific head. To evaluate the model, you then just need inputs, labels, and a metric.
Model description
"Attention Is All You Need" is a landmark 2017 research paper authored by eight scientists working at Google, responsible for expanding 2014 attention mechanisms proposed by Bahdanau et al. into a new deep learning architecture known as the transformer with an encoder, cross-attention, and a decoder.
Open source status
Provide useful links for the implementation
EncoderDecoderModel is supported via the Hugging Face API, though it isn't possible to evaluate such models properly: https://github.com/huggingface/transformers/issues/28721 How is it possible to build and evaluate a vanilla transformer with an encoder, cross-attention, and a decoder in Hugging Face?
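One way to get a vanilla encoder/cross-attention/decoder transformer through the Hugging Face API is to pair two BERT-style configs in an EncoderDecoderModel, as sketched below. The tiny dimensions are arbitrary choices for illustration, not recommended hyperparameters, and the model here is randomly initialized rather than pretrained; passing labels to the forward call yields a cross-entropy loss that can serve as the starting point for manual evaluation.

```python
import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Tiny, illustrative configs (sizes are assumptions, not tuned values).
encoder_cfg = BertConfig(
    vocab_size=64, hidden_size=16, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=32,
)
decoder_cfg = BertConfig(
    vocab_size=64, hidden_size=16, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=32,
    is_decoder=True,            # decoder uses causal self-attention
    add_cross_attention=True,   # and attends to the encoder outputs
)

config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_cfg, decoder_cfg)
model = EncoderDecoderModel(config=config)

# Dummy batch: 2 source sequences of length 8, 2 target sequences of length 6.
input_ids = torch.randint(0, 64, (2, 8))
labels = torch.randint(0, 64, (2, 6))

outputs = model(input_ids=input_ids, decoder_input_ids=labels, labels=labels)
print(outputs.loss)          # scalar cross-entropy loss over the labels
print(outputs.logits.shape)  # torch.Size([2, 6, 64])
```

From here, manual evaluation is a matter of running such forward (or generate) calls over a held-out dataset and aggregating a metric over the outputs.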