keras-team / keras-hub

Pretrained model hub for Keras 3
Apache License 2.0
803 stars 243 forks

Add ShieldGemma class to KerasHub #1974

Open RyanMullins opened 2 weeks ago

RyanMullins commented 2 weeks ago

Draft PR to facilitate discussion around #1973.

mattdangerw commented 5 days ago

Sorry for the delay! I've been on leave, just getting back. Still thinking on this, not sure what to do...

One potential option would be to expose a more robust tool for classifying with a language model. Something like a TextClassifierLM task class that takes in a token -> class idx mapping and an optional prompt template. It could be both fit() and predict() friendly, using regular causal "supervised fine-tuning" for training. So you could use it in pure inference mode for ShieldGemma (we'd have to resave the model with this task config, including the 0 -> yes, 1 -> no map), or use it to DIY fine-tune a Gemma classifier for, say, any classification problem via this predict-a-single-token setup.
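To make the idea concrete, here is a rough sketch of the inference half of that single-token setup, assuming we already have the next-token logits at the end of the prompt. Everything here is hypothetical: `TextClassifierLM` does not exist in KerasHub, and `PROMPT_TEMPLATE`, `build_prompt`, `classify_from_next_token_logits`, and the token ids are made up for illustration. It only uses NumPy, so the model call itself is left out.

```python
import numpy as np

# Hypothetical prompt template; a real one would ship with the task config.
PROMPT_TEMPLATE = "Does the following text violate the policy?\n{text}\nAnswer:"


def build_prompt(text, template=PROMPT_TEMPLATE):
    """Fill the (assumed) prompt template with the text to classify."""
    return template.format(text=text)


def classify_from_next_token_logits(logits, token_id_to_class, num_classes):
    """Turn next-token logits into class probabilities.

    logits: array of shape (batch, vocab_size), taken at the final
        prompt position of a causal LM.
    token_id_to_class: dict mapping a vocabulary token id to a class index.
        Several token ids may map to the same class.
    """
    logits = np.asarray(logits, dtype=np.float64)
    token_ids = sorted(token_id_to_class)
    # Keep only the logits of the tokens that stand for a class.
    selected = logits[:, token_ids]
    # Numerically stable softmax restricted to those tokens.
    selected = selected - selected.max(axis=-1, keepdims=True)
    probs = np.exp(selected)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Sum token probabilities into their classes.
    class_probs = np.zeros((logits.shape[0], num_classes))
    for col, token_id in enumerate(token_ids):
        class_probs[:, token_id_to_class[token_id]] += probs[:, col]
    return class_probs


# Toy usage: vocabulary of 5 tokens, token 3 -> class 0, token 1 -> class 1.
# These ids are placeholders, not real Gemma vocabulary ids.
toy_logits = np.array([[0.0, 2.0, 0.0, 2.0, 0.0], [0.0, -5.0, 0.0, 5.0, 0.0]])
toy_probs = classify_from_next_token_logits(toy_logits, {3: 0, 1: 1}, num_classes=2)
toy_labels = toy_probs.argmax(axis=-1)
```

For fit(), the same mapping could in principle run in reverse (class index -> target token) so training stays plain causal fine-tuning on one extra token, but that part is not sketched here.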

Or we could go with something like this, which is totally ad hoc. But totally ad hoc tasks might land us in hot water elsewhere. For example, Hugging Face is trying to generate code snippets for user-uploaded KerasHub models in a reliable fashion, and having a more consistent cross-model task flow definitely helps with that aim.