epic: Jan Studio (users can finetune)

0xSage commented 10 months ago

HackMD

Motivation

Users need to be able to fine-tune models on their own hardware (part of Create your own Assistant).
Users need a seamless fine-tuning experience (not only for LLMs, but also for embedding models for retrieval, function calling, etc.), with the resulting models usable right away within Jan.
Users can access this component through the Studio extension in the Jan suite (Jan desktop app, Jan server).
Jan Studio should expose an API similar to the OpenAI fine-tuning API, but it also needs to offer more fine-tuning methods, as well as conversion to the model formats that Jan supports (see #783).
Target users:
- Normies: click one button; everything is filled in with defaults.
- Power users (cc @hahuyhoang411, please add).
We have to focus on a very neat UX/UI that is better than existing Gradio-based solutions for power users, while still allowing power users to extend it as they need.

Specs

@hiro-v thinks we need to develop Studio decoupled from the Jan app and abstract the training engine and API into a Python runtime, so that we can reuse the existing Python ecosystem.
High-level architecture (green already exists, yellow is to be developed):
- The Jan suite interacts with the Studio component through the Jan Studio extension and an OpenAI-compatible API.
- Jan Studio runs in a Docker environment using Python 3.9, with one web server and one scheduler for background tasks. The training engines are abstracted behind a TrainingEngine class, which lets users fine-tune on CPU, NVIDIA GPU, or Apple MLX.
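The TrainingEngine abstraction described above could look roughly like the sketch below. The issue only names the class, so everything else is an assumption made for illustration: the `train` signature, the `TrainingResult` record, the device labels, and the `register`/`get_engine` helpers.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class TrainingResult:
    """Hypothetical result record; field names are illustrative only."""
    artifact_path: str
    metrics: dict = field(default_factory=dict)


class TrainingEngine(ABC):
    """Interface that each backend (CPU, NVIDIA GPU, Apple MLX) would implement."""

    device: str  # assumed labels such as "cpu", "cuda", "mlx"

    @abstractmethod
    def train(self, model: str, training_file: str, hyperparameters: dict) -> TrainingResult:
        ...


class CpuEngine(TrainingEngine):
    """Stand-in backend: a real engine would run the fine-tuning loop here."""

    device = "cpu"

    def train(self, model, training_file, hyperparameters):
        return TrainingResult(
            artifact_path=f"artifacts/{model}.bin",
            metrics={"epochs": hyperparameters.get("n_epochs", 1)},
        )


# Registry mapping a device label to the engine class that handles it.
ENGINES = {}


def register(engine_cls):
    ENGINES[engine_cls.device] = engine_cls
    return engine_cls


def get_engine(device: str) -> TrainingEngine:
    """Instantiate the engine registered for `device`, or raise ValueError."""
    try:
        return ENGINES[device]()
    except KeyError:
        raise ValueError(f"no training engine registered for {device!r}") from None


register(CpuEngine)
```

With this shape, the scheduler would only call `get_engine(device).train(...)`, and an unsloth or MLX backend could be added by registering another subclass.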

Studio comes with the following API

Create fine tuning job

POST http://studio.jan.ai/v1/fine_tuning/jobs

Input

    model: string - required
    training_file: file upload - jsonl (required)
    validation: [file upload (file), GSM8K (choose), TruthfulQA (choose)]
    hyperparameters: object

Output (ft_job)

    {
      "object": "fine_tuning.job",
      "id": "ftjob-abc123",
      "model": "gpt-3.5-turbo-0613",
      "created_at": 1614807352,
      "fine_tuned_model": null,
      "organization_id": "org-123",
      "result_files": [],
      "status": "queued",
      "validation_file": null,
      "training_file": "file-abc123"
    }
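As a minimal sketch of how the server might build the ft_job record above when a job is created: the function name `new_ft_job`, its parameters, and the ID scheme are hypothetical; only the field names and the initial "queued" status come from the example output.

```python
import time
import uuid


def new_ft_job(model, training_file_id, validation_file_id=None, organization_id=None):
    """Build a freshly queued fine-tuning job record shaped like the ft_job example."""
    return {
        "object": "fine_tuning.job",
        # Assumed ID scheme: "ftjob-" prefix plus 12 random hex characters.
        "id": f"ftjob-{uuid.uuid4().hex[:12]}",
        "model": model,
        "created_at": int(time.time()),  # Unix timestamp in seconds
        "fine_tuned_model": None,        # filled in once training succeeds
        "organization_id": organization_id,
        "result_files": [],
        "status": "queued",
        "validation_file": validation_file_id,
        "training_file": training_file_id,
    }
```

Calling `new_ft_job("gpt-3.5-turbo-0613", "file-abc123")` returns a dict that serializes with `json.dumps` to the same shape as the example, with `None` rendered as `null`.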

List fine tuning jobs
GET http://studio.jan.ai/v1/fine_tuning/jobs -> LIST[ft_job]
Retrieve fine-tuning job
GET http://studio.jan.ai/v1/fine_tuning/jobs/<:ft_id> -> ft_job
Cancel fine-tuning
POST http://studio.jan.ai/v1/fine_tuning/jobs/<:id>/cancel -> ft_job (status: cancelled)
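The list, retrieve, and cancel operations can be illustrated with a small in-memory store. This is a sketch under assumptions: the `JobStore` class is hypothetical, and the terminal statuses ("succeeded", "failed") are borrowed from the OpenAI fine-tuning API rather than stated in this issue, which only shows "queued" and "cancelled".

```python
class JobStore:
    """In-memory stand-in for whatever backs the /v1/fine_tuning/jobs endpoints."""

    # Assumed terminal statuses; a job in one of these can no longer be cancelled.
    TERMINAL = {"succeeded", "failed", "cancelled"}

    def __init__(self):
        self._jobs = {}

    def add(self, job):
        self._jobs[job["id"]] = job

    def list_jobs(self):
        """GET /v1/fine_tuning/jobs"""
        return list(self._jobs.values())

    def retrieve(self, job_id):
        """GET /v1/fine_tuning/jobs/<id>; a KeyError would map to HTTP 404."""
        return self._jobs[job_id]

    def cancel(self, job_id):
        """POST /v1/fine_tuning/jobs/<id>/cancel"""
        job = self.retrieve(job_id)
        if job["status"] in self.TERMINAL:
            raise ValueError(f"job {job_id} is already {job['status']}")
        job["status"] = "cancelled"
        return job
```

Rejecting a cancel on a finished job is a design choice of this sketch; the issue does not say how that case should behave.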

Jan Studio will store data the same way Jan does: on your local filesystem.

    jan/
        assistants/
        models/
        extensions/
        logs/
        settings/
        threads/
        studio/
            jobs/
                logs.jsonl
                metrics.jsonl
                metadata.json
                files/
                    training.jsonl
                    validation.jsonl
                    files.abc
                artifacts/
                    model.bin
                    model_2.bin
                    model.fp6.gguf
                    model.Q5.gguf
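The storage layout above could be created and written to as in the sketch below. It assumes one subdirectory per job ID under studio/jobs/, which the layout does not spell out, and the helper names and metric fields (`step`, `loss`) are hypothetical.

```python
import json
from pathlib import Path


def make_job_dir(jan_root, job_id):
    """Create studio/jobs/<job_id>/ with its files/ and artifacts/ subfolders."""
    root = Path(jan_root) / "studio" / "jobs" / job_id
    (root / "files").mkdir(parents=True, exist_ok=True)
    (root / "artifacts").mkdir(exist_ok=True)
    return root


def append_metric(job_root, step, loss):
    """Append one JSON object per line to metrics.jsonl."""
    with (Path(job_root) / "metrics.jsonl").open("a", encoding="utf-8") as f:
        f.write(json.dumps({"step": step, "loss": loss}) + "\n")


def read_metrics(job_root):
    """Read metrics.jsonl back into a list of dicts, skipping blank lines."""
    with (Path(job_root) / "metrics.jsonl").open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Appending line by line keeps metrics.jsonl readable while a job is still running, which is one reason to prefer JSONL over a single JSON document here.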

Designs

Mock up

Figma

Tasklist

[ ] Discussion to address possible technical blockers and finalize the specs - @janhq/engineers
- [ ] App Pod
- [ ] Foundry team: consider this the tool we use every day to automate our work
[ ] Jan Studio in Python
- [ ] Web server (FastAPI) + scheduler (Celery) - @hiro-v
- [ ] TrainingEngine abstraction
- [ ] TrainingEngine -> unsloth
- [ ] TrainingEngine -> similar to unsloth but using MLX (optional)
- [ ] Converter job
  - [ ] transformer-based -> GGUF - @hiro-v
[ ] Jan Studio extension - @louis-jan

Not in Scope

An MLOps platform to fine-tune or train anything. We focus on LLMs and embedding models first (i.e., NLP-based models).

Appendix

Reference for OAI: https://platform.openai.com/docs/api-reference/fine-tuning/create
The workflow the Jan Foundry team currently uses (CI plus manual steps) to generate data and then fine-tune models