instructlab / training

InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data
https://pypi.org/project/instructlab-training/
Apache License 2.0
22 stars 47 forks source link

e2e test takes over an hour #328

Open RobotSail opened 3 weeks ago

RobotSail commented 3 weeks ago

This needs to be way quicker. Realistically we should always be running off of a precomputed dataset.

nathan-weinberg commented 2 weeks ago

Duplicate of https://github.com/instructlab/instructlab/issues/2496

ktam3 commented 1 week ago

@RobotSail can we close this in favor of https://github.com/instructlab/instructlab/issues/2496

RobotSail commented 1 week ago

@ktam3 We should keep this one as it's specific to the training repo and may not reflect the needs of the CLI repo.

nathan-weinberg commented 1 week ago

@RobotSail uh, the code is exactly the same:

how is this specific/different?

RobotSail commented 1 week ago

@nathan-weinberg What I'm saying is that the github action files defining the e2e tests are physically located in this repo. Although the script we're running lives in the instructlab repo, we still define the tasks here separately from the upstream repo.

RobotSail commented 1 week ago

We should keep this issue here as the needs of the training repo may be different from that of the instructlab repo & the SDG repo.