keras-team / keras-nlp

Modular Natural Language Processing workflows with Keras
Apache License 2.0

make it easier to adjust dropout when loading gemma models #1620

Open josharian opened 2 months ago

josharian commented 2 months ago

Is your feature request related to a problem? Please describe.

I'm fine tuning Gemma models. I'd like to be able to:

- change the dropout rate of a loaded model, and
- add dropout to a model that has none (or remove it entirely).

There is no way to do either of these without reaching into the model and fiddling with its layers. And going from no dropout to some dropout (or vice versa) is particularly challenging, because the relevant layers need to be created/destroyed.

My current "easy" workaround is to manually edit the config on disk. (For from_preset, this means messing with the Kaggle cache. For saved models, this means unzipping and then rezipping a .keras file.) This works but is, um... not something I'm proud of.
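
For what it's worth, here is a sketch of the same trick done in memory rather than on disk: rebuild the backbone from an edited config and copy the weights over. It assumes "dropout" is a key in the backbone's get_config() and that dropout layers are weightless, so the two weight lists still line up:

```python
import keras_nlp

# Load just the backbone, then rebuild it from an edited config.
# Assumes "dropout" appears in get_config() and that dropout layers
# carry no weights, so get_weights()/set_weights() still match up.
backbone = keras_nlp.models.GemmaBackbone.from_preset("gemma_2b_en")

config = backbone.get_config()
config["dropout"] = 0.1  # the rate we actually want while fine tuning

tuned = keras_nlp.models.GemmaBackbone.from_config(config)
tuned.set_weights(backbone.get_weights())
```

Rebuilding from config at least sidesteps the create/destroy problem, since from_config constructs whatever layers the new rate requires.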

Describe the solution you'd like

One option would be an API that lets me override config values, such as dropout, at model load time (be that from_preset or keras.saving.load_model).
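
Something like the following, say. (config_overrides and the file name are made up for illustration; only the preset name is real.)

```python
import keras
import keras_nlp

# Hypothetical API, for illustration only -- config_overrides does not exist.
model = keras_nlp.models.GemmaCausalLM.from_preset(
    "gemma_2b_en",
    config_overrides={"dropout": 0.1},
)

# And the equivalent for a saved model:
model = keras.saving.load_model(
    "my_finetuned_gemma.keras",
    config_overrides={"dropout": 0.0},
)
```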

Another would be a set_dropout API that recursively sets dropout values and adds/removes dropout layers as appropriate.
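
A minimal sketch of the easy half of that API, assuming that mutating a Dropout layer's rate attribute after build takes effect on later calls (the rate is read at call time). This only retunes existing layers; creating/destroying them, per the above, is the hard part:

```python
import keras

def set_dropout(model, rate):
    """Hypothetical helper: set `rate` on every existing Dropout layer."""
    # _flatten_layers is a private Keras API, used here only for
    # illustration; it yields the layer and all of its nested sublayers.
    for layer in model._flatten_layers():
        if isinstance(layer, keras.layers.Dropout):
            layer.rate = rate
```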

Additional context

This issue is meant more in the spirit of an experience report than a feature request. I suspect(?) that adjusting dropout is (or should be?) a common desire, so it might be worth considering across all models, not just Gemma (and not just for me).