Open russellb opened 3 weeks ago
At a minimum, I'd like to see the following to resolve this issue:
- Upstream repo has a training framework for both pretraining and finetuning
- Upstream repo supports a lot more model architectures and systems optimizations
- None of this is needed for ILAB
- The current fork only has the relevant functionality (GPTDolomiteModel class) to use the Padding Free transformer optimization.
Only needing a subset of a library is incredibly common. That doesn't justify a fork.
I see from https://github.com/instructlab/training/pull/55 that one blocking issue was that the original repo does not have published releases on PyPI, which is a blocker for instructlab. That would be a reason for a temporary fork, at least, if we were unable to get the original library released quickly enough for our needs. If it's just that, the fork could go away as soon as the upstream library makes its own releases.
I assume there's still more to it that I'm missing?
We need to publish GPTDolomite as a package and consume it in training. Since the dolomite-engine repo owned by IBM has more than just the model, we pulled out the model and created it in this repo.
but if that repo had its own published releases, would that have solved the main issue?
or did the extra features in the upstream repo cause a problem of some kind? If so, are those problems tracked somewhere?
Let me lay out a couple more reasons why we need this:
We will be including more documentation on what this repo is and how it fits into the training repository itself.
OK - my takeaway is that the upstream project is not usable by InstructLab:
So, we had no choice but to fork it and maintain our own version of the subset needed by InstructLab?
Is that about right?
Long-term we should come up with a better solution but for RHEL AI GA this will be fine.
No suggestion for an immediate change -- I just wanted to make sure the context was recorded so the reason for the fork is clear to those who come after us. I'll propose a README update as a resolution to this issue.
I can't find any explanation for why this repository was necessary. Maybe it's in a PR somewhere? The README points to the original repo, but doesn't explain why a fork is necessary.
The typical process for adopting a new repository under InstructLab is a proposal to the instructlab/dev-docs repository and getting it approved by the Oversight Committee. This shouldn't block any work, as all technical work could be done in a personal fork of the repo in the meantime. Some comments in this issue with some background would help me provide guidance for what to do from here.