This requires a custom configuration, as the default sequence length is still 128.
I'm not fully sure this is a great idea; I need to test memory consumption in more depth.
The rationale for this change is that I'm following the steps that would be required to create a modern LM chat UI using Core ML, and seeing how far we can get.
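For reference, here's a minimal sketch of the kind of override involved, converting directly with coremltools and requesting a flexible sequence length instead of the fixed 128. This assumes a TorchScript causal LM; the model path, the `input_ids` tensor name, and the 512 upper bound are all illustrative:

```python
import numpy as np
import torch
import coremltools as ct

# Hypothetical TorchScript model; any traced causal LM that takes an
# `input_ids` tensor would work the same way.
traced_model = torch.jit.load("model.pt")

# Accept 1–512 tokens instead of a fixed length. Memory use grows with
# the upper bound, which is exactly what still needs measuring.
seq_shape = ct.Shape(shape=(1, ct.RangeDim(lower_bound=1, upper_bound=512)))

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input_ids", shape=seq_shape, dtype=np.int32)],
    minimum_deployment_target=ct.target.iOS16,  # implies an ML Program model
)
mlmodel.save("ModelWithFlexibleContext.mlpackage")
```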