instructlab / sdg

Python library for Synthetic Data Generation
Apache License 2.0

Add SDG library code #42

Closed by aakankshaduggal 5 days ago

aakankshaduggal commented 6 days ago

This PR adds the code for the SDG library (#41).

To test the code, use tests/test_knowledge.py: specify the model endpoint, then run the file as a module from the repository root: python3 -m tests.test_knowledge


Corresponding design doc: https://github.com/instructlab/dev-docs/blob/main/docs/sdg/sdg-api-interface.md

russellb commented 6 days ago

> This PR adds the code for the SDG library (#41).
>
> To test the code, use tests/test_knowledge.py: specify the model endpoint, then run the file as a module from the repository root: python3 -m tests.test_knowledge

You'll need to move the test script to a different directory. It's getting picked up by the automation that runs unit tests. If you move it, it shouldn't run.
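
To illustrate why the file location matters, here is a rough sketch of filename-based test discovery. It assumes the automation runs pytest over the tests/ directory with pytest's default file patterns (test_*.py and *_test.py); the thread does not confirm the runner or its configuration. The helper would_be_collected and the scripts/ directory are hypothetical names used only for this example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# pytest's default file-name patterns for test modules (python_files).
DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def would_be_collected(path, test_root="tests", patterns=DEFAULT_PATTERNS):
    """Approximate whether a runner scanning test_root would pick up path.

    Assumption: the runner only scans test_root and selects files by name.
    This is a simplification, not the project's actual CI configuration.
    """
    p = PurePosixPath(path)
    under_root = PurePosixPath(test_root) in p.parents
    name_matches = any(fnmatch(p.name, pattern) for pattern in patterns)
    return under_root and name_matches


# The script in its current location matches both conditions.
print(would_be_collected("tests/test_knowledge.py"))
# Moved to a hypothetical scripts/ directory, it is outside the scanned root.
print(would_be_collected("scripts/test_knowledge.py"))
```

Under these assumptions, moving the script out of the scanned directory is enough to keep it out of the unit test run. Renaming it so that it no longer matches the patterns would have the same effect.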

russellb commented 5 days ago

As of right now, none of this new code is being used, so it will not break the CLI (as confirmed by the e2e CI job).

Since this is so big, we're going to merge and keep working on it in tree.