- [X] I have searched the existing issues
### Is your feature request related to a problem? Please describe it
- llama.cpp settings (e.g., attention parameters) should be consistent across llama.cpp, Cortex, and Jan
- From an engineering perspective, we should ensure llama.cpp settings are bubbled up to Cortex and Jan
### Describe the solution
- [ ] Identify all relevant model settings that need to be synced
- [ ] Design a common format for representing these settings across all projects
- [ ] Jan Model Settings should follow common format
- [ ] Cortex should allow users to pass inference-time and runtime parameters
- [ ] Define a process for tracking llama.cpp updates (who should drive this?)
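The "common format" item above could work as a thin mapping layer: each project keeps its own surface, but settings translate to llama.cpp flags through one table. A minimal sketch, assuming hypothetical common key names (`ctx_len`, `ngl`, etc. are illustrative, not an agreed schema); the llama.cpp flags shown (`--ctx-size`, `--n-gpu-layers`, `--temp`, `--top-k`, `--top-p`) are real CLI options:

```python
# Illustrative mapping from a hypothetical common settings schema
# to llama.cpp CLI flags. Key names on the left are assumptions.
COMMON_TO_LLAMA_CPP = {
    "ctx_len": "--ctx-size",
    "ngl": "--n-gpu-layers",
    "temperature": "--temp",
    "top_k": "--top-k",
    "top_p": "--top-p",
}

def to_llama_cpp_args(settings: dict) -> list[str]:
    """Translate common settings into llama.cpp CLI arguments,
    silently skipping keys llama.cpp does not understand."""
    args: list[str] = []
    for key, value in settings.items():
        flag = COMMON_TO_LLAMA_CPP.get(key)
        if flag is not None:
            args += [flag, str(value)]
    return args

print(to_llama_cpp_args({"ctx_len": 4096, "temperature": 0.7, "custom": 1}))
# → ['--ctx-size', '4096', '--temp', '0.7']
```

Keeping the table in one shared place would also make the "llama.cpp updates" task concrete: a new upstream flag becomes a one-line addition that Cortex and Jan both pick up.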
### Teachability, documentation, adoption, migration strategy
-
### What is the motivation / use case for changing the behavior?