keras-team / keras-nlp

Modular Natural Language Processing workflows with Keras
Apache License 2.0

Add Mistral 0.2 models as possible presets #1515

Closed: borisdayma closed this issue 5 months ago

borisdayma commented 5 months ago

Is your feature request related to a problem? Please describe.

We can currently load Mistral 7B models with keras_nlp.models.MistralCausalLM.from_preset("mistral_7b_en") (or "mistral_instruct_7b_en"). I noticed that both presets are version 0.1 of the models. The 0.2 versions are significantly improved while using the same code base.
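For reference, the loading call quoted above can be sketched as a small script. This is a minimal sketch, not the library's recommended workflow: the helper name load_mistral, the choice of the JAX backend, the prompt, and the max_length value are all assumptions made for illustration. Only the two preset names and the from_preset call come from the issue text.

```python
import os

# Assumption: use the JAX backend. This must be set before Keras is imported.
os.environ.setdefault("KERAS_BACKEND", "jax")

# Preset names quoted in the issue text (both are version 0.1 checkpoints).
MISTRAL_V01_PRESETS = ("mistral_7b_en", "mistral_instruct_7b_en")


def load_mistral(preset="mistral_7b_en"):
    """Load a Mistral causal LM from one of the presets named in the issue.

    Hypothetical helper: calling it downloads the 7B checkpoint.
    """
    if preset not in MISTRAL_V01_PRESETS:
        raise ValueError(
            f"Unknown preset {preset!r}; this sketch only knows {MISTRAL_V01_PRESETS}"
        )
    # Imported lazily so the module can be inspected without keras_nlp installed.
    import keras_nlp

    return keras_nlp.models.MistralCausalLM.from_preset(preset)


if __name__ == "__main__":
    model = load_mistral("mistral_instruct_7b_en")
    # Illustrative prompt and length; adjust to taste.
    print(model.generate("What is Keras?", max_length=64))
```

Restricting the helper to a fixed tuple of names makes it obvious which checkpoints are in play; a 0.2 preset would be added to that tuple once its name is known.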

Describe the solution you'd like

It would be nice to offer the 0.2 variants of both base and instruct models as possible presets.

Describe alternatives you've considered

Keep using the 0.1 versions, or switch to other libraries… However, Keras offers nice integration with JAX + sharding!

Additional context

I want to use Keras models as backbones to train vision-language models (VLMs), and Mistral is a very strong model at 7B.

tirthasheshpatel commented 5 months ago

Thanks for the report! Will add the new presets.

borisdayma commented 5 months ago

Thanks a lot! By the way, only the instruct variant has a version 0.2, but it's much improved over 0.1.

tirthasheshpatel commented 5 months ago

Thanks for this info. #1520 adds the preset. It should be available on Kaggle in around an hour and will be accessible in the next release of KerasNLP.
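One way to check whether an installed KerasNLP release already ships a given preset is to list the names registered on the model class and filter them. The sketch below assumes that MistralCausalLM.presets is a class-level mapping keyed by preset name; the helper names presets_matching and installed_mistral_presets are hypothetical, and the thread does not state the name of the new preset, so none is filled in here.

```python
def presets_matching(preset_names, fragment):
    """Return the sorted preset names that contain `fragment`."""
    return sorted(name for name in preset_names if fragment in name)


def installed_mistral_presets():
    """Names of the Mistral presets registered in the installed keras_nlp."""
    import keras_nlp  # lazy import: requires keras_nlp to be installed

    # Assumption: `presets` is a class-level mapping keyed by preset name.
    return list(keras_nlp.models.MistralCausalLM.presets)


if __name__ == "__main__":
    # Show every registered instruct preset, whatever its version.
    print(presets_matching(installed_mistral_presets(), "instruct"))
```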