Closed yohann-benchetrit closed 1 year ago
You have successfully added a new CodeQL configuration .github/workflows/codeql.yml:build
. As part of the setup process, we have scanned this repository and found 3 existing alerts. Please check the repository Security tab to see all alerts.
High level, I notice you implemented the sampling as part of the greedy_search
function. Technically, with sampling this name is incorrect. I see a couple of ways of going about this.
Rename the function to a more generic sample
method and remove the do_sample
parameter to avoid confusion. Then, if top_k
or top_p
was specified, a true sampling would be implemented. Otherwise, it would essentially be a greedy search. (in this case, as well, top_k should probably default to 1 instead of 0.) This would be the best way to reuse code IMO.
Follow HF and have separate methods for sampling and greedy. From an adoption perspective, this might be beneficial as users of torchtext tend to also have experience with HF; however, there would certainly be some duplicated code.
Thoughts? @yohann-benchetrit
Will want to add temperature, but can do that in a follow-up PR. Tracking issue: #2138
High level, I notice you implemented the sampling as part of the
greedy_search
function. Technically, with sampling this name is incorrect. I see a couple of ways of going about this.
- Rename the function to a more generic
sample
method and remove thedo_sample
parameter to avoid confusion. Then, iftop_k
ortop_p
was specified, a true sampling would be implemented. Otherwise, it would essentially be a greedy search. (in this case, as well, top_k should probably default to 1 instead of 0.) This would be the best way to reuse code IMO.- Follow HF and have separate methods for sampling and greedy. From an adoption perspective, this might be beneficial as users of torchtext tend to also have experience with HF; however, there would certainly be some duplicated code.
Thoughts? @yohann-benchetrit
I agree with @joecummings here in that by allowing users to provide top_k
or top_p
this method is no longer implementing a simple greedy search. In terms of which option to go with, I think for reusability, going for option 1 makes sense. However, I also think that it's more intuitive for users to be able to call a greedy_search
method without having to worry about modifying the top_k
or top_p
params or spend time understanding what sample
means? Most NLP users will have heard of beam_search
or greedy_search
but sample
might not be as intuitive.
An alternative proposal here could be to keep the greedy_search
method as is and then add the top_k
and top_p
params only to the beam_search
method. This way if users want to do greedy search with those 2 params set, they can just set the beam_size
to 1. For this approach, the greedy_search
method will be kept simple and intuitive and all the configurability would come from the beam_search
method.
I agree with @joecummings here in that by allowing users to provide
top_k
ortop_p
this method is no longer implementing a simple greedy search. In terms of which option to go with, I think for reusability, going for option 1 makes sense. However, I also think that it's more intuitive for users to be able to call agreedy_search
method without having to worry about modifying thetop_k
ortop_p
params or spend time understanding whatsample
means? Most NLP users will have heard ofbeam_search
orgreedy_search
butsample
might not be as intuitive.An alternative proposal here could be to keep the
greedy_search
method as is and then add thetop_k
andtop_p
params only to thebeam_search
method. This way if users want to do greedy search with those 2 params set, they can just set thebeam_size
to 1. For this approach, thegreedy_search
method will be kept simple and intuitive and all the configurability would come from thebeam_search
method.
@Nayef211 We abstract away the internal methods, so high-level users should never have to call greedy_search
or beam_search
directly. I see your point though; however, I don't think we should leave complexity to one function or the other. If we really want this to be completely composable and apply to all cases of sampling (which can happen in beam search, too), we may want to combine these methods into one. This seems beyond the scope of this PR though as beam_search
is not implemented on main right now.
Thoughts? @yohann-benchetrit
Agreed, thanks @joecummings and @Nayef211 for your comments !
So with this additional information my understanding is:
generate
is the "maximum generality" decoding-interfacegreedy_search
and beam_search
are (or will be) implemented on their own as public subroutines of generate
but they are not necessarily intended to be directly called by the end-user.My thoughts on this:
As @Nayef211 suggested and closer to your proposition 2 (HF), I would tend to also leave greedy_search
sampling-free and delegate 'sampling with a single beam' (i.e what I implemented here) to a beam_sample
method that will build on beam_search
once it will be in main
.
However, since beam_search
is not currently in main
, does this mean that this PR should solely restrict to adding _get_top_{k,p}_restriction
methods (with corresponding tests) ?
As a side note, although HF also recommends going solely for using generate
, I like the idea of having the fundamentals such as greedy_search
and beam_search
on the top of the API, for "no-extra-thought" usage.
- As @Nayef211 suggested and closer to your proposition 2 (HF), I would tend to also leave
greedy_search
sampling-free and delegate 'sampling with a single beam' (i.e what I implemented here) to abeam_sample
method that will build onbeam_search
once it will be inmain
. However, sincebeam_search
is not currently inmain
, does this mean that this PR should solely restrict to adding_get_top_{k,p}_restriction
methods (with corresponding tests) ?
I like this idea more and unless there are customers that are asking for sampling to be implemented in greedy_search
, I would hold off on adding these params to the method and instead just add the helper methods like you mentioned. However if customers are asking for this, I don't see any harm in adding the functionality to greedy_search
and removing it later down the line. Ultimately, I'll leave the decision to @joecummings.
@joecummings @Nayef211
Thanks again for your comments. I addressed them as follows:
_get_top_{k,p}_restriction
and deleted their usage in decoding (except for unit tests)._apply_temperature
method (and a unit test), as Joe was suggesting, addressing issue #2138 .
Add the following token-decoding helpers to
GenerationUtils.generate
:top-p
top-k
remove_invalid_values
temperature
(see Issue 2138)Add corresponding unit tests and docstrings.