Implement documented generation config

Feature request

Implement the generation config described here: https://huggingface.co/docs/transformers.js/en/api/utils/generation#utilsgenerationgenerationconfigtype--code-object-code

Motivation

I spent some time experimenting with the library and noticed that a lot of the functionality described in the documentation is not implemented yet. For example, the output_scores flag currently has no effect (this is something I need for a project I'm working on). Another example is the return_dict_in_generate flag, which causes an error in the tokenization step of the text generation pipeline. Looking at the code, it seems to be a work in progress, as there are TODO comments throughout. My understanding is that, ideally, the implementation should stay as close as possible to the original Python one in the main Transformers library.

Your contribution

I would love to help with the implementation, and I'm wondering how best to go about it. For example, the ModelOutput class is present but isn't currently used in the main generate function. Should I focus on integrating it properly, which would take more time and larger PRs, or should I first add the missing functionality step by step in a quick-and-dirty way (as is currently done), even if that means a rewrite in the future?
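
To make the question more concrete, here is a rough sketch of the behaviour I have in mind, loosely modelled on how the Python generate handles these two flags. Everything in it is hypothetical and not taken from the transformers.js code: GenerateOutputSketch stands in for a ModelOutput-style container, greedyGenerateSketch stands in for generate, and stepFn stands in for a model forward pass that returns next-token logits. Only the flag names (max_new_tokens, eos_token_id, output_scores, return_dict_in_generate) come from the documented generation config.

```javascript
// Hypothetical sketch only: none of these names exist in transformers.js.

/** Stand-in for a ModelOutput-style result container. */
class GenerateOutputSketch {
  constructor({ sequences, scores = null }) {
    this.sequences = sequences; // generated token ids, including the prompt
    this.scores = scores;       // one logits array per generated step, or null
  }
}

/** Index of the largest value (first one wins on ties). */
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; ++i) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

/**
 * Greedy decoding over a single sequence.
 * `stepFn(ids)` is an assumed callback returning next-token logits.
 */
function greedyGenerateSketch(stepFn, inputIds, config = {}) {
  const {
    max_new_tokens = 5,
    eos_token_id = null,
    output_scores = false,
    return_dict_in_generate = false,
  } = config;

  const sequence = [...inputIds];
  const scores = [];

  for (let step = 0; step < max_new_tokens; ++step) {
    const logits = stepFn(sequence);
    // Only keep per-step scores when the caller asked for them.
    if (output_scores) scores.push([...logits]);
    const nextToken = argmax(logits);
    sequence.push(nextToken);
    if (nextToken === eos_token_id) break;
  }

  // Plain token ids by default; a structured output when requested.
  if (!return_dict_in_generate) return sequence;
  return new GenerateOutputSketch({
    sequences: sequence,
    scores: output_scores ? scores : null,
  });
}

// Toy "model" over a 4-token vocabulary: always predicts (last id + 1) % 4.
const toyStep = (ids) => {
  const logits = [0, 0, 0, 0];
  logits[(ids[ids.length - 1] + 1) % 4] = 1;
  return logits;
};
```

With the toy model, greedyGenerateSketch(toyStep, [0], { max_new_tokens: 3 }) returns the plain array [0, 1, 2, 3], while passing return_dict_in_generate: true and output_scores: true returns an object whose sequences field holds the same ids and whose scores field holds one logits array per generated token. The real implementation would of course need batching, logits processors, and beam search on top of this.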

PS Thanks for the great work. I really like this library and I'm looking forward to contributing :)

xenova / transformers.js