Open TanGentleman opened 1 month ago
One solution is to define some dictionaries in constants.py and reference them for each LLM. An LLM entry would then be (LLM) by default, or (LLM, HS2) for Hyperparameter Schema 2, making it easy to add new schemas and reuse them as components instead of editing settings.json manually each time.
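A minimal sketch of what those constants.py dictionaries might look like — all names and values here are illustrative assumptions, not the project's actual code:

```python
# Hypothetical constants.py sketch: named hyperparameter schemas that
# any LLM entry can reference instead of inlining values.

DEFAULT_HYPERPARAMETERS = {
    "temperature": 0.7,
    "max_tokens": 1000,
    "stop": [],
}

# "HS2" as an alternate schema, e.g. for short deterministic responses.
HYPERPARAMETER_SCHEMAS = {
    "HS2": {
        "temperature": 0.0,
        "max_tokens": 500,
        "stop": ["\n\n"],
    },
}

def resolve_hyperparameters(schema_name=None):
    """Return the defaults, overridden by a named schema if given."""
    params = dict(DEFAULT_HYPERPARAMETERS)
    if schema_name is not None:
        params.update(HYPERPARAMETER_SCHEMAS[schema_name])
    return params
```

With this shape, an entry like (LLM, HS2) would just call `resolve_hyperparameters("HS2")`, and adding a new schema is one more dictionary rather than another hand-edit of settings.json.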
This is my (imo) elegant way to handle all new CLI flags or API parameters going forward. Since I have defined schemas for all the different RAG, Chat, and Hyperparameter settings, I want an easier way to override any of the values in settings.json. Essentially, it would be nice to quickly run
python chat.py -m "model-name"
or
python chat.py -t 500
to limit the response tokens to 500. Adding all these settings adds bulk that I want to streamline and plan for beforehand. For instance, compared to the start of this project, there's now a ton of configuration settings!
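A hedged sketch of how such flags could layer on top of settings.json — only `-m` and `-t` come from the examples above; the key names and helper are assumptions:

```python
import argparse
import json

def load_settings_with_overrides(argv=None, path="settings.json"):
    """Load settings.json, then apply any CLI flags on top of it."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", "--model", help="override the model name")
    parser.add_argument("-t", "--max-tokens", type=int,
                        help="limit response tokens")
    args = parser.parse_args(argv)

    with open(path) as f:
        settings = json.load(f)

    # Only override keys the user actually passed on the command line.
    if args.model is not None:
        settings["model"] = args.model
    if args.max_tokens is not None:
        settings["max_tokens"] = args.max_tokens
    return settings
```

The point is that settings.json stays the single source of truth, and a flag is a one-off override rather than a manual edit.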
Current settings.json file:
Note that when adding things like custom stop tokens or temperatures for certain substeps, I run into an annoying problem with how this is organized. My plan is to create a simple schema for hyperparameters that can be configured and used flexibly by any of the LLMs.
I also want to create a new function to save a custom configuration as a file like settings.json that can be run in the future. I would like it to be something akin to the Vectara Ingest crawlers with YAML configs for running a workflow.
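A small sketch of what saving and reloading a named configuration could look like, mirroring the settings.json format — the function names and the configs/ directory are hypothetical:

```python
import json
from pathlib import Path

def save_config(settings, name, config_dir="configs"):
    """Snapshot the current settings dict as a reusable config file,
    akin to settings.json, that a future run can load by name."""
    path = Path(config_dir)
    path.mkdir(parents=True, exist_ok=True)
    out = path / f"{name}.json"
    out.write_text(json.dumps(settings, indent=2))
    return out

def load_config(name, config_dir="configs"):
    """Load a previously saved configuration by name."""
    return json.loads((Path(config_dir) / f"{name}.json").read_text())
```

The same pair could just as easily read and write YAML once a workflow schema is settled; JSON is used here only to stay consistent with settings.json.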
Perhaps in the future, a YAML may contain:
This last step would be very powerful. Setting up a YAML config with 1000+ websites or documents in the input field, and having it create a database with meaningful metadata or analysis, is very tricky to streamline but immensely valuable!
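For illustration only, a workflow YAML in the spirit of the Vectara Ingest crawlers might take a shape like this — every key below is an assumption, not a proposed final schema:

```yaml
# Hypothetical workflow config; all keys are illustrative assumptions.
workflow:
  name: research-corpus
  model: model-name
  hyperparameter_schema: HS2
input:
  sources:
    - https://example.com/docs
    - ./papers/
output:
  database: corpus.db
  metadata:
    - title
    - summary
```

The input block is where the 1000+ websites or documents would go, and the output block is where the database and metadata/analysis steps would be declared.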