Last weekend I ran 5 fine-tuning versions. Sadly, Llama2 didn't make it through a full epoch of training, and Caleuche ran out of VRAM.
I uploaded the models to Ollama and tried generating from there.
I have adapters and merged models for 3 of the versions (adapters run on top of the base LLM, while a merged model is a standalone new LLM). I'm checking latency to decide which version to use for running the questions; I'll upload the results to a Postgres table over the weekend so Ale-Nico's evaluation framework can run next week.
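A minimal sketch of how the latency check could work: time each generation call and compare means across versions. The model name and the Ollama endpoint shown in the comment are assumptions (Ollama's default local REST API), not the actual setup.

```python
import time
import statistics

def measure_latency(generate, prompts):
    """Time each call to `generate` over `prompts`; return stats in seconds."""
    times = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)  # any callable: adapter-backed or merged model
        times.append(time.perf_counter() - start)
    return {"mean": statistics.mean(times), "max": max(times)}

# Example against a local Ollama server (model name is a placeholder):
# import requests
# def ollama_generate(prompt, model="llama2-ft-v1"):
#     r = requests.post("http://localhost:11434/api/generate",
#                       json={"model": model, "prompt": prompt, "stream": False})
#     r.raise_for_status()
#     return r.json()["response"]
```

Running `measure_latency` once per version with the same prompt set gives comparable numbers to pick the variant worth evaluating.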
Fine-Tuning Update: