instructlab / sdg

Python library for Synthetic Data Generation
https://pypi.org/project/instructlab-sdg/
Apache License 2.0
24 stars 37 forks source link

Remove system prompt from data generation #96

Open oindrillac opened 4 months ago

oindrillac commented 4 months ago

Remove system prompt from data generation and will be re-introduced in the mixing phase.

https://github.com/instructlab/sdg/blob/b28a12bb647ef72f2b152051fc73d55c5a30da98/src/instructlab/sdg/generate_data.py#L38

shivchander commented 4 months ago

+1, would be good to introduce system role during the data mixing phase which prepares the dataset for training - this makes it a tad bit cleaner to understand - as the system role is only applicable to training

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.