aldopareja opened this issue 5 months ago
Thank you for writing this up! One small question -- what would you say makes it "research-oriented" vs. more general purpose (for research and beyond)? When I hear "research-oriented", I also hear "not for production use," which I'm pretty sure isn't your intention!
a "stable" version of the upstream repository will be the production branch. Whatever has been battle tested and shown to work. But the upstream repo should move faster.
and faster moving in training is necessarily research-oriented.
It's just a matter of moving functions around and reorganizing; no core logic will be changed, since this is as "edge" as you can get at the moment, and it's what will become the first production-rated trainer (we have already tested that everything works as expected).
a "stable" version of the upstream repository will be the production branch. Whatever has been battle tested and shown to work. But the upstream repo should move faster.
OK. It sounds like you also have a git branching and release management strategy in mind? That seems important to capture somewhere.
Branches vs. tags vs. forks: that should be discussed, and we should go with whatever people think is best for the community.
Redesigning the InstructLab Training Repository
We aim to redesign the InstructLab training repository with a focus on simplicity, modularity, and performance. The goal is to create a standalone, research-oriented trainer that can be used by anyone experimenting with new research directions.
Philosophy
The philosophy behind this redesign is to create a "small form factor and extremely fast (throughput-wise) trainer". This trainer should be easy to read, understand, and modify. Users might want to change how gradients are aggregated, create new samplers based on gradient statistics, support new types of large language models (LLMs), add new optimizers, and so on.
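To make the "easy to modify" goal concrete, here is a minimal, hypothetical sketch (not code from this repository) of the kind of shallow training loop being described: the sampler, the gradient-accumulation logic, and the optimizer are all visible in one short function, so a researcher can change any of them directly. All names, defaults, and shapes below are illustrative assumptions.

```python
# Hypothetical sketch of a "small, hackable" training loop; none of these
# names come from the InstructLab repository. The points a researcher might
# want to change (sampling, gradient aggregation, the optimizer) are all
# visible in one place instead of hidden behind abstractions.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model: nn.Module,
          dataset: TensorDataset,
          make_optimizer=lambda params: torch.optim.AdamW(params, lr=1e-4),
          grad_accum_steps: int = 4,
          epochs: int = 1) -> None:
    # Swap in a custom sampler here, e.g. one driven by gradient statistics.
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    optimizer = make_optimizer(model.parameters())
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        for step, (inputs, labels) in enumerate(loader, start=1):
            loss = loss_fn(model(inputs), labels) / grad_accum_steps
            loss.backward()  # gradient aggregation happens right here
            if step % grad_accum_steps == 0:
                optimizer.step()
                optimizer.zero_grad()


if __name__ == "__main__":
    # Tiny synthetic dataset so the sketch runs end to end.
    x, y = torch.randn(64, 32), torch.randint(0, 4, (64,))
    train(nn.Linear(32, 4), TensorDataset(x, y))
```

Because everything lives in one short function, changing how gradients are aggregated or plugging in a different optimizer is a small local edit rather than a subclass or callback.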
By creating a trainer that is easy to use and modify, we hope to attract more users from the research community and foster faster innovation. We aim to provide an alternative to trainers like the one from Hugging Face, which, while comprehensive, can be complex due to its support for a wide variety of data formats, sharding strategies, accelerators, and abstractions.
Structure
The fine-tuning trainer should be shallow, with a simple script-wise separation across the following areas:
Structuring the trainer in this way lets each component be modified (mostly) independently, making it easier for users to customize the trainer to their needs. It also makes the trainer easier to understand, as each component has a clear, well-defined role.
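As a purely hypothetical illustration of what a script-wise separation could look like (the actual list of areas is not reproduced here), the following self-contained sketch keeps data loading, model construction, the training loop, and checkpointing as separate functions that a thin entry point wires together; the module boundaries and names are made up for the example.

```python
# Hypothetical illustration of script-wise separation; the file names in the
# comments are invented, not the repository's actual layout. Each area could
# live in its own small script and be replaced without touching the others.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


# data.py: dataset construction, tokenization, sampling, batching
def build_dataloader() -> DataLoader:
    x, y = torch.randn(64, 32), torch.randint(0, 4, (64,))
    return DataLoader(TensorDataset(x, y), batch_size=8, shuffle=True)


# model.py: model (and tokenizer) construction
def build_model() -> nn.Module:
    return nn.Linear(32, 4)


# train.py: the forward/backward/optimizer-step loop
def train_one_epoch(model: nn.Module, loader: DataLoader) -> None:
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for inputs, labels in loader:
        optimizer.zero_grad()
        loss_fn(model(inputs), labels).backward()
        optimizer.step()


# checkpoint.py: saving (and, eventually, resuming)
def save_checkpoint(model: nn.Module, path: str) -> None:
    torch.save(model.state_dict(), path)


# main.py: the only place the pieces meet
if __name__ == "__main__":
    model = build_model()
    train_one_epoch(model, build_dataloader())
    save_checkpoint(model, "checkpoint.pt")
```

Because the pieces only meet in the entry point, swapping any one of them (a new sampler, a different optimizer, another model family) does not ripple through the rest of the code.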
Advantages
This approach has several advantages: