ChakshuGautam commented 4 months ago

[ ] Model Version Management (commit hash, semantic version) - should happen while training
[ ] Provide model files (onnx, pt, bin) through a CDN
[ ] Rollback to an older version
[ ] Deployment by a version number
[ ] Track costs during training
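
A minimal sketch of how the version management, deploy-by-version, and rollback items above could fit together. The `ModelVersionRegistry` class and its method names are hypothetical and not part of autotune; they only illustrate pairing a semantic version with a commit hash and keeping a deployment history to roll back through.

```python
from dataclasses import dataclass, field


def parse_semver(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into ints so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


@dataclass(frozen=True)
class ModelVersion:
    version: str      # semantic version, e.g. "1.2.0"
    commit_hash: str  # commit recorded at training time


@dataclass
class ModelVersionRegistry:
    """Hypothetical in-memory registry; a real one would persist its state."""
    versions: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # deployed versions, newest last

    def register(self, version: str, commit_hash: str) -> ModelVersion:
        parse_semver(version)  # raises ValueError on a malformed version
        if version in self.versions:
            raise ValueError(f"version {version} is already registered")
        entry = ModelVersion(version, commit_hash)
        self.versions[version] = entry
        return entry

    def latest(self) -> ModelVersion:
        return max(self.versions.values(), key=lambda v: parse_semver(v.version))

    def deploy(self, version: str) -> ModelVersion:
        """Deploy by version number and remember it for later rollback."""
        entry = self.versions[version]  # KeyError if the version is unknown
        self.history.append(version)
        return entry

    def rollback(self) -> ModelVersion:
        """Return to the version that was deployed before the current one."""
        if len(self.history) < 2:
            raise RuntimeError("no older deployment to roll back to")
        self.history.pop()
        return self.versions[self.history[-1]]
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.10.0" would sort before "1.2.0".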

Clicking the train button on the Admin Panel

ML Pod:

[ ] #142

Admin panel :

[ ] When the train button is clicked, it will hit the model registry API to get:
- Base Model Branch on HF: the base model that will be trained on the dataset
- task_type: classification/NER etc.
- model_format: onnx/pytorch - safetensors
- model_name: the purpose the model is being trained for, like agri_classification in AKAI/KMAI (can be the same as service_name)
- epochs: number of epochs the model is trained for
- args: training arguments used to fine-tune the model
- quantization: mostly None unless specified
[ ] Admin Panel will hit the dataset registry to get the dataset ID for the given model-botid

[ ] Admin Panel will hit /train API with the following parameters:


{
  "model": Base Model Branch on HF (from model registry),
  "epochs": (from model registry),
  "task_type": (from model registry),
  "dataset": (from dataset registry),
  "versioning": {
    "owner": botid,
    "environment": bot environment,
    "model_name": (from model registry)
  },
  "args": (from model registry)
}
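
As an illustration of the payload template above, here is a minimal sketch of how the Admin Panel side might assemble the /train request body from the two registry lookups. The function name `build_train_payload` and the key names of the registry record (`base_model`, `epochs`, and so on) are assumptions; only the output keys come from the template.

```python
import json


def build_train_payload(model_entry: dict, dataset_id: str,
                        bot_id: str, environment: str) -> dict:
    """Assemble the /train body; model_entry is a hypothetical registry record."""
    return {
        "model": model_entry["base_model"],      # Base Model Branch on HF
        "epochs": model_entry["epochs"],
        "task_type": model_entry["task_type"],
        "dataset": dataset_id,                   # from the dataset registry
        "versioning": {
            "owner": bot_id,
            "environment": environment,
            "model_name": model_entry["model_name"],
        },
        "args": model_entry.get("args", {}),     # training arguments, if any
    }


if __name__ == "__main__":
    # Illustrative values only; none of these come from a real registry.
    entry = {
        "base_model": "example-org/example-base-model",
        "epochs": 3,
        "task_type": "classification",
        "model_name": "agri_classification",
        "args": {"learning_rate": 2e-5},
    }
    payload = build_train_payload(entry, "dataset-123", "bot-42", "staging")
    print(json.dumps(payload, indent=2))
```

Building the body in one function keeps the mapping from registry fields to /train parameters in a single place, which should make it easier to adjust once the registry schema is settled.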



## Dataset service:  

- Create a dataset for models that stores at least the following for each model-botid:

     - Base Model Branch on HF: the base model that will be trained on the dataset
     - task_type: classification/NER etc.
     - model_format: onnx/pytorch - safetensors
     - model_name: the purpose the model is being trained for, like agri_classification in AKAI/KMAI (can be the same as service_name)
     - epochs: number of epochs the model is trained for
     - args: training arguments used to fine-tune the model
     - quantization: mostly None unless specified

- Create a dataset for datasets that stores:
     - the dataset ID for each model for each bot
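
The two datasets described above can be pictured as lookups keyed by model and bot. The sketch below is an in-memory stand-in under that assumption; `ModelEntry`, `DatasetService`, and their method names are hypothetical, and a real service would sit on a database.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    """One row of the models dataset; field names follow the list above."""
    base_model: str                      # Base Model Branch on HF
    task_type: str                       # e.g. "classification" or "NER"
    model_format: str                    # e.g. "onnx" or "pytorch"
    model_name: str                      # e.g. "agri_classification"
    epochs: int
    args: dict = field(default_factory=dict)
    quantization: Optional[str] = None   # None unless specified


class DatasetService:
    """Hypothetical in-memory store keyed by (model_name, bot_id)."""

    def __init__(self) -> None:
        self._models = {}    # (model_name, bot_id) -> ModelEntry
        self._datasets = {}  # (model_name, bot_id) -> dataset id

    def put_model(self, bot_id: str, entry: ModelEntry) -> None:
        self._models[(entry.model_name, bot_id)] = entry

    def put_dataset(self, model_name: str, bot_id: str, dataset_id: str) -> None:
        self._datasets[(model_name, bot_id)] = dataset_id

    def get_model(self, model_name: str, bot_id: str) -> ModelEntry:
        return self._models[(model_name, bot_id)]   # KeyError if missing

    def get_dataset_id(self, model_name: str, bot_id: str) -> str:
        return self._datasets[(model_name, bot_id)]  # KeyError if missing
```

Keying both stores on the same pair means one model-botid lookup yields everything the /train call needs.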

ChakshuGautam commented 3 months ago

@suresh12 to review the Doc

Gautam-Rajeev commented 3 months ago

Clicking the train button on the Admin Panel

ML Pod:

[ ] Modify the train API to support versioning

Admin panel :

[ ] When the train button is clicked, it will hit the model registry API to get:
- Base Model Branch on HF: the base model that will be trained on the dataset
- task_type: classification/NER etc.
- model_format: onnx/pytorch - safetensors
- model_name: the purpose the model is being trained for, like agri_classification in AKAI/KMAI (can be the same as service_name)
- epochs: number of epochs the model is trained for
- args: training arguments used to fine-tune the model
- quantization: mostly None unless specified
[ ] Admin Panel will hit the dataset registry to get the dataset ID for the given model-botid

[ ] Admin Panel will hit /train API with the following parameters:


{
  "model": Base Model Branch on HF (from model registry),
  "epochs": (from model registry),
  "task_type": (from model registry),
  "dataset": (from dataset registry),
  "versioning": {
    "owner": botid,
    "environment": bot environment,
    "model_name": (from model registry)
  },
  "args": (from model registry)
}



## Dataset service:  

- Create a dataset for models that stores at least the following for each model-botid:

     - Base Model Branch on HF: the base model that will be trained on the dataset
     - task_type: classification/NER etc.
     - model_format: onnx/pytorch - safetensors
     - model_name: the purpose the model is being trained for, like agri_classification in AKAI/KMAI (can be the same as service_name)
     - epochs: number of epochs the model is trained for
     - args: training arguments used to fine-tune the model
     - quantization: mostly None unless specified

- Create a dataset for datasets that stores:
     - the dataset ID for each model for each bot

KDwevedi commented 2 months ago

Scoping the Model Registry from MLflow: lift it and use it directly.

Desirable Features

- Model metadata storage
- Version management + fine-tuning
- Deployment
- Utilising CDNs to make BIN and other model files available
- Recording metrics for model use
[ ] PoC
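
For the CDN item, one possible convention is to address each model file by model name, version, and filename. The sketch below assumes that layout; the `model_file_url` helper, the path scheme, and the example host are all hypothetical, not an existing autotune or MLflow API.

```python
from urllib.parse import quote

# File types named in this issue; extend as needed.
ALLOWED_SUFFIXES = (".onnx", ".pt", ".bin")


def model_file_url(cdn_base: str, model_name: str, version: str, filename: str) -> str:
    """Build <cdn_base>/<model_name>/<version>/<filename> (assumed layout)."""
    if not filename.endswith(ALLOWED_SUFFIXES):
        raise ValueError(f"unsupported model file type: {filename}")
    # Percent-encode each segment so unusual names cannot alter the path.
    parts = [quote(part, safe="") for part in (model_name, version, filename)]
    return cdn_base.rstrip("/") + "/" + "/".join(parts)
```

Putting the version in the path means a rollback only changes which URL clients are pointed at, and already-published files never need to be overwritten.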

BharatSahAIyak / autotune

Model Registry #125
