axolotl-ai-cloud / axolotl

Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0

Evaluate on specified data #875

Closed: Peter-Devine closed this issue 6 months ago

Peter-Devine commented 11 months ago

โš ๏ธ Please check that this feature request hasn't been suggested before.

🔖 Feature description

I want to evaluate on data that may be distinct from the training data.

Currently, the evaluation data is a random sample of the training data, but I have a situation where I have a lot of training data from a slightly noisy Dataset A and a very small amount of very high-quality data from Dataset B.

I want to be able to train on Dataset A and evaluate on Dataset B.

โœ”๏ธ Solution

When using a Huggingface dataset, it would be nice to use the actual validation split as the eval_dataset for training. This way, you could manually specify which data will be used for training and which for validation.
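To make that concrete, something along these lines is roughly what I have in mind (eval_datasets is a made-up key here, and the dataset paths are just placeholders):

```yaml
# Rough sketch of the requested behavior -- eval_datasets does not exist today.
datasets:
  - path: org/noisy-dataset-a     # Dataset A: large but slightly noisy (placeholder path)
    type: alpaca
    split: train

eval_datasets:                    # hypothetical key: evaluate on a different dataset/split
  - path: org/clean-dataset-b     # Dataset B: small but high quality (placeholder path)
    type: alpaca
    split: validation

val_set_size: 0                   # don't carve a random eval slice out of the training data
```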

I think some code would have to be refactored in https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/src/axolotl/utils/data.py

Thanks!

โ“ Alternatives

No response

๐Ÿ“ Additional Context

No response

Acknowledgements

codiceSpaghetti commented 9 months ago

I would need this feature as well

JiyangZhang commented 9 months ago

Any updates on this enhancement? Thanks!

Peter-Devine commented 9 months ago

Bump. It would be really handy to be able to evaluate continuously on a specified dataset, different from the training dataset, so that we could control early stopping etc. based on performance on a target task.

For example, if we are just training on unstructured text but evaluating on a small structured test dataset, this could help us find the optimal amount of training for transferring to the target task.
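Concretely, the kind of run I have in mind looks roughly like this (option names follow the usual axolotl conventions, but treat it as a sketch and check the docs for the exact names):

```yaml
# Sketch: evaluate on the held-out target-task data periodically and stop once
# its metric stops improving; verify option names against the current docs.
eval_steps: 200                  # run evaluation every 200 steps
save_steps: 200                  # keep checkpoints aligned with evaluations
early_stopping_patience: 3       # stop after 3 evaluations without improvement
```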

Thanks.

NanoCode012 commented 6 months ago

Hey, PR #786 adds support for test_dataset: now. We also have bench_dataset if you want to run benchmarks (more info: https://github.com/OpenAccess-AI-Collective/axolotl/issues/311#issuecomment-2028311885).
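Roughly, the config would look something like this (please double-check the exact key name and accepted fields against PR #786 and the docs; the key may be spelled test_datasets in current versions, and the paths below are placeholders):

```yaml
# Hedged sketch -- confirm key name and fields against PR #786 / the docs.
datasets:
  - path: org/noisy-dataset-a     # training data (placeholder path)
    type: alpaca

test_dataset:                     # as mentioned above; may be test_datasets in current versions
  - path: org/clean-dataset-b     # held-out evaluation data (placeholder path)
    type: alpaca
    split: train

val_set_size: 0                   # no random eval split from the training data
```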