cc @patrickvonplaten
I like the idea of using a `use_config_defaults` a lot - I think that's a great additional safety mechanism to ensure it's possible to keep backward compatibility.
Also we were thinking about the idea of having a `generation_config.json` file that can optionally be passed to `generate` by the user and that includes all the default values that are currently set in the config. This would also make it easier to have multiple different generation configs.

Some models like `bart-large` (https://huggingface.co/facebook/bart-large/blob/main/config.json#L45) always have certain generation parameters enabled by default, and IMO it would be a smoother transition to help the user extract a `generation_config.json` from `config.json` and then always pass this config, if present in the repo, to `generate(...)` instead of forcing the user to always pass all those arguments to `generate`.
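For illustration, a minimal sketch of such an extraction helper (the key list and helper name are assumptions, not an existing transformers utility):

```python
import json

# Hypothetical list of generation-related keys to pull out of config.json;
# the real set would have to match the arguments accepted by `generate`.
GENERATION_KEYS = [
    "max_length", "min_length", "num_beams", "length_penalty", "early_stopping",
    "no_repeat_ngram_size", "temperature", "top_k", "top_p", "do_sample",
]

def extract_generation_config(config_path="config.json",
                              output_path="generation_config.json"):
    """Copy generation-related defaults from a model config into a new file."""
    with open(config_path) as f:
        model_config = json.load(f)
    generation_config = {k: model_config[k] for k in GENERATION_KEYS if k in model_config}
    with open(output_path, "w") as f:
        json.dump(generation_config, f, indent=2)
    return generation_config
```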
With the config, we could do something like the following automatically:

- The user calls `generate`. We detect that no `generation_config.json` is present and that default generation params are used from `config.json`.
- We help the user extract a `generation_config.json` file from `config.json`, while leaving `config.json` untouched to keep backwards compatibility with `use_config_defaults`.
- If a `generation_config.json` is present we always use this and do not look into the config.
- We keep the hard-coded `max_length=20` default, because it's always set and we don't want to create a `generation_config.json` for all models.

Also happy to jump on a call to brainstorm about this a bit!
Fair point! 👍
From the comment above, let's consider the updated requirements:

- Until `v5`, the default behavior can’t change, i.e., we will use the model `config.json` as a source of defaults;
- From `v5` onwards, the default behavior is to use `generate_config.json` as a source of defaults;
- Users must keep full control over `generate`.

A solution that fits all requirements is the ability to specify where the defaults should be loaded from, with default paths controlled by us. With the aid of a script to create the new generation config file from the existing model config file, the transition should be smooth and users can anticipate any change.
E.g. if we have a `generation_config_file` flag, defaulting to `None` and where a path in the model repo can be specified, then we could:

- Set `generation_config_file="config.json"`, which would mimic the existing default behavior (and would be the default behavior for now);
- Set `generation_config_file="generation_config.json"`, which would use the new config file for generation (which would be the default behavior in the future);
- Set `generation_config_file` to ANY generation config path, so that power users can have multiple configurations for the same model;
- Set `generation_config_file=False` (or another non-`None` falsy value) to not use any configuration at all.
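In code, the four options would look like this (hypothetical usage of the proposed flag, not an existing transformers API):

```python
# Hypothetical values for the proposed `generation_config_file` flag:
model.generate(input_ids, generation_config_file="config.json")             # current behavior
model.generate(input_ids, generation_config_file="generation_config.json")  # future default
model.generate(input_ids, generation_config_file="my_beam_search.json")     # power users
model.generate(input_ids, generation_config_file=False)                     # no config at all
```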
We seem to need two warnings ⚠️ :

- [After `v5` we will be defaulting to a new config file, which may not exist in a user's model repo, and the model may have generation parameters in its config] If the configuration file does not exist, fall back to `config.json` and warn about it. We can quickly scan `config.json` to avoid raising a warning if it doesn't contain any generation argument;
- When `generation_config_file` is not specifically set by the user, a warning should be raised if the config replaces any argument. Many configs don't replace any value.

Both warnings can be avoided by specifying the `generation_config_file` argument. They may be a bit verbose, but I think verbosity (which can be shut down easily) is preferable to magic confusing behavior.
The `max_length=20` default (and other similar defaults) can be easily added -- `max_length = max_length if max_length is not None else 20` after attempting to load the configs. We can add them to the arguments' documentation (see below).
🤔 The only issue I have with this approach is that it is hell to document (similar to the current approach). Having "this argument defaults to X or to `config.argument`" in every argument's documentation is verbose and confusing, and users need to be aware that the configuration files play an important role.
My suggestion here would be to make `generation_config_file` the second argument of `generate` (after `input_ids`), so that it becomes immediately clear that `generate` argument defaults can be set through a file. Then, I would remove further references to the config in the docs, relying on the warnings to remind the user of what's going on. I think it is clear by now that long docs don't avoid simple issues :(
WDYT?
(P.S.: will edit the issue after we settle on an approach :) )
Cool, I think this is going into a very nice direction! A couple more questions to think about:

- Do we need a `generate_config_file` keyword argument for `generate(...)`? For me it would be much more intuitive to just have `config: Optional[Dict]` as an argument. This would then mean it requires the user to do one more step for a specific config:

```python
generate_config = ...  # load generation config from path
model.generate(input_ids, config=generate_config)
```
- We could add a `config` argument to the init of `GenerationMixin`, which would make backwards compatibility very easy: `from_pretrained(...)` would load either a `generation_config.json` or, if not present, a `config.json`, and then set it as `self.generation_config = config` => then every generation model would have access to `self.generation_config`. In `generate` we could then add `self.generate_config = config if config is not None else self.generate_config` (the default one) and then overwrite `self.generate_config` once more if the user passes generate args into `generate` directly (e.g. `model.generate(top_k=top_k)`). A rough sketch of that flow is below.
- `model.generate(..., generate_config="config.json")` would have to load a config, which opens too many problems with internet connection etc....
- What type should `generation_config` be? Just a `dict`, or let's maybe create a class for it (similar to `BloomConfig`)? Creating its own class probably also helps with documentation.
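A rough sketch of that flow (method names and fallback details are assumptions for discussion, not the actual implementation):

```python
import copy

class GenerationMixin:
    def _set_generation_config(self, generation_config):
        # Set at from_pretrained() time: generation_config.json if present
        # in the repo, otherwise derived from config.json.
        self.generation_config = generation_config

    def generate(self, input_ids=None, config=None, **kwargs):
        # Per-call config: a user-passed config wins over the model's default.
        generation_config = copy.deepcopy(
            config if config is not None else self.generation_config
        )
        # Direct keyword arguments (e.g. top_k=...) take the highest precedence.
        for name, value in kwargs.items():
            setattr(generation_config, name, value)
        ...
```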
-> What do you think?
@patrickvonplaten Agreed, the argument name is a bit too long 😅 However, if we decide to go the `GenerationMixin.__init__` route, we can't pick `config` -- `PreTrainedModel`, which inherits from `GenerationMixin`, uses a `config` argument for the model config. Perhaps `generation_config`? We could then do `.from_pretrained(foo, generation_config=bar)`.
I love the ideas you gave around the config:

- If we load it at `__init__` and always attempt to load the new file format before falling back to the original config, it actually means we don't need to do a major release to build the final version of this updated configuration handling! No need to change defaults with a new release at all ❤️;
- If the user calls `model.generate(top_k=top_k)` and then `model.generate(temperature=temperature)`, `top_k` should be the original config's `top_k`. Copies of objects are needed.

Regarding `dict` vs `class` -- I'd go with `class` (or perhaps a simpler `dataclass`). Much easier to document and enforce correctness, e.g. check if the right arguments are being used with a certain generation type.
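As an example of the correctness checks a class enables (a hypothetical sketch, not an existing transformers class):

```python
from dataclasses import dataclass

@dataclass
class GenerationConfig:
    do_sample: bool = False
    temperature: float = 1.0
    num_beams: int = 1

    def validate(self):
        # Catch arguments that silently have no effect for the chosen mode.
        if not self.do_sample and self.temperature != 1.0:
            raise ValueError("`temperature` only has an effect when `do_sample=True`.")
        if self.num_beams < 1:
            raise ValueError("`num_beams` has to be at least 1.")
```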
It seems like we are in agreement. Are there more issues we can anticipate?
Very nice summary @gante, thanks for writing this all down - I agree with all the above points!

@LysandreJik @sgugger and maybe @thomwolf could you take a quick look here? I think @gante and I now have an actionable plan for `generate()` and would be ready to open a PR.

Before starting the PR, it would be nice if you could check if you generally agree with our comments here so that we're not totally on a different page before opening such a big PR. The PR will then take some time and require discussion, but I think we have a clear vision of what we want now.
@patrickvonplaten @LysandreJik @sgugger @thomwolf -- I took the liberty of updating the issue at the top with the plan that originated from the discussion here (and also to structure the whole thing in my head better) :)
Thanks for the write-up! I think this is a much welcome change that will tremendously improve the way we use `generate`.
Writing down some thoughts below.
- `generation_config` sounds more explicit. `generate` is very understandable by us because we know what the "generate method" is, but "generation config" sounds so much clearer to me than "generate config".
- What happens if someone uses the generation config of `bart-large-cnn` in `t5-large`? Would we enforce model-specificity, or would we allow this to work? How would we handle model-specific attributes (maybe there aren't any; there seems to be only RAG that has its own `generate` method)?

The biggest work here will likely be education & documentation. I think this will already make things much clearer, but I suppose the much awaited generate method doc rework will be an absolute requirement after this refactor!
Agreed, the biggest issue is and will be education and documentation. Hopefully, this will make the process easier 🙏
- There are only a couple of arguments in `generate` that are used with a limited number of (encoder-decoder) models: `forced_bos_token_id` and `forced_eos_token_id`. Additionally, there is one argument, `encoder_no_repeat_ngram_size`, that is only used in encoder-decoder models (and has no effect on decoder-only). The remaining 36 arguments are usable by all models. IMO, having a single class would make documentation (the key issue) much simpler, and model<>arguments verification can be done in the function (as it is done at present).
- Power users could indeed have multiple `generation_config` files. But @LysandreJik raised a good point, as the name of the files containing the defaults for different tasks may not be immediately obvious to the users, which implies more documentation pain. Perhaps we can take the chance here to approximate `generate` to `pipeline`, which we know is user-friendly -- in the `pipeline`, specific config parameters are loaded for each task (here's an example of a config with task-specific parameters). We could use the exact same structure with the new `generation_config` files, where all task-specific arguments can be set this way, and `generate` could gain a new `task` argument. That way, there would be a higher incentive to set task-specific parameters that would work across the HF ecosystem (`generate` and `pipeline` for now, perhaps more in the future).

```python
from transformers import AutoModelForSeq2SeqLM

# With the `task` parameter, it is trivial to share the parameters for some desired behavior.
# When loading the model, the existence of task-specific options would be logged to the user.
model = AutoModelForSeq2SeqLM.from_pretrained("...")
input_prompt = ...
task_tokens = model.generate(**input_prompt, task="my_task")
# There would be an exception if `my_task` is not specified in the generation config file.
```
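The file structure could mirror the `task_specific_params` entry that already exists in some pipeline-compatible model configs (shown as a Python dict; the exact layout and values are assumptions):

```python
# Hypothetical generation_config.json contents, reusing the
# task_specific_params layout from pipeline-compatible configs:
generation_config = {
    "max_length": 20,
    "num_beams": 1,
    "task_specific_params": {
        "my_task": {"max_length": 142, "min_length": 56, "num_beams": 4},
    },
}
```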
The plan looks good to me, but the devil will be in the details ;-) Looking forward to the PRs actioning this!
Closing -- `generation_config` is now the source of defaults for `.generate()`.
EDIT: Updated with the discussion up to 2022/08/20
**Why?**

A confusing part of `generate` is how the defaults are set. When a certain argument is not specified, we attempt to fetch it from the model `config` file. This makes `generate` unpredictable and hard to fully document (the default values change for each model), as well as a major source of issues :hocho:

**How?**
We have the following requirements:

1️⃣ The existing behavior can’t be removed, i.e., we must be able to use the model `config.json` as a source of generation parameters by default;
2️⃣ We do need per-model defaults -- some models are designed to do a certain thing (e.g. summarization), which requires a specific generation configuration;
3️⃣ Users must have full control over generate, with minimal hidden behavior.

Ideally, we also want to:

4️⃣ Have separation of concerns and use a new `generate_config.json` to parameterize generation.

A TL;DR of the plan consists in changing the paradigm from “non-specified `generate` arguments are overridden by the [model] configuration file” to “`generate` arguments will override the [generate] configuration file, which is always used”. With proper documentation changes and logging/warnings, the user will be aware of what's being set for `generate`.

**Step 1: Define a new generate config file and class**
Similar to the model config, we want a `.json` file to store the generation defaults. The class itself can be a very simplified version of `PretrainedConfig`, also with functionality to load/store from the hub.
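A minimal sketch of such a class (field choice and helper names are assumptions; hub upload/download plumbing is omitted):

```python
import json
from dataclasses import asdict, dataclass, fields

@dataclass
class GenerationConfig:
    # A small subset of `generate` arguments with their hard-coded defaults.
    max_length: int = 20
    num_beams: int = 1
    do_sample: bool = False
    temperature: float = 1.0
    top_k: int = 50
    top_p: float = 1.0

    def save_pretrained(self, path="generate_config.json"):
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)

    @classmethod
    def from_dict(cls, config_dict):
        # Ignore unknown keys, so a full model config.json can also be loaded.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in config_dict.items() if k in known})
```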
**Step 2: Integrate loading the generate config file in `.from_pretrained()`**
The generation configuration file should be loaded when initializing the model with a `from_pretrained()` method. A couple of things to keep in mind:

- It needs its own `kwarg` in `from_pretrained`: `generate_config` (or `generation_config`? Leaning toward the former as it has the same name as the function);
- It should default to `generate_config.json` (contrarily to the model `config`, which defaults to `None`). This will allow users to set this argument to `None`, to load a model with an empty generate config. Some users have requested a feature like this;
- If a config file is set, `generate` will attempt to load it;
- If there is no `generate_config.json` in the repo, it will attempt to initialize the generate configuration from the model `config.json`. This means that this solution will not change any `generate` behavior and will NOT need a major release 👼
- Since loading happens at `from_pretrained()` time, logging will only happen at most once and will not be verbose.
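Putting the bullets above together, the loading logic could look roughly like this (local files only; hub download and error handling omitted; builds on the `GenerationConfig` sketch from Step 1):

```python
import json
import os

def load_generation_config(model_path, generate_config="generate_config.json"):
    """Resolve generation defaults at from_pretrained() time."""
    if generate_config is None:
        # Explicitly requested empty config: hard-coded defaults only.
        return GenerationConfig()
    candidate = os.path.join(model_path, generate_config)
    if os.path.isfile(candidate):
        with open(candidate) as f:
            return GenerationConfig.from_dict(json.load(f))
    # No generate_config.json in the repo: fall back to the generation-related
    # fields of config.json, preserving today's behavior (no major release).
    with open(os.path.join(model_path, "config.json")) as f:
        return GenerationConfig.from_dict(json.load(f))
```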
**Step 3: Generate uses the generate config class internally**
Instead of using the configuration to override arguments when they are not set, overwrite a copy of the generation config at `generate` time. I.e., instead of the current per-argument fallback pattern, do the copy-and-overwrite pattern sketched below.
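Roughly, the change in pattern (an illustrative sketch of code inside `generate()`):

```python
import copy

# Before: each argument individually falls back to the model config.
max_length = max_length if max_length is not None else self.config.max_length

# After: overwrite a copy of the generation config with user-passed arguments.
generation_config = copy.deepcopy(self.generation_config)
if max_length is not None:
    generation_config.max_length = max_length
```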
This change has three main benefits; among them, `generate` argument validation for each type of generation can be built in simple functions that don't need ~30 arguments as input 🙃

**Step 4: Document and open PRs with the generation config file**
Rewrite part of the documentation to explain that a generation config is ALWAYS used (regardless of having defaults loaded from the hub or not). Open Hub PRs to pull generate-specific parameters from `config.json` to `generate_config.json`.
**Pros/Cons**

Pros:

- Any `generate` default will be logged to the screen when loading a generate-compatible model;
- Simpler `generate` code;
- Shared `generate`-related code across frameworks;

Cons: