The synthetic data generation code was moved to a separate package, instructlab.sdg, so that it can be consumed by multiple projects. The current project layout does not make it obvious which names are intended for public consumption with a stable API and which are internal implementation details.
I recommend:
Prefix all modules with an underscore (_), e.g. _generate_data.py and _utils.py.
Import the public names in __init__.py and list them in the __all__ variable.
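
As a rough illustration of the two recommendations, the sketch below builds a throwaway package in a temporary directory and checks what a star import exposes. The package name sdg_demo and the functions generate_data and _helper are hypothetical placeholders, not the actual instructlab.sdg API; only the module name _generate_data.py is taken from the proposal above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout for illustration only; "sdg_demo", "generate_data"
# and "_helper" are assumed names, not the real instructlab.sdg API.
pkg_root = Path(tempfile.mkdtemp())
pkg = pkg_root / "sdg_demo"
pkg.mkdir()

# Private module: the leading underscore marks it as an implementation detail.
(pkg / "_generate_data.py").write_text(
    "def generate_data():\n"
    "    return 'generated'\n"
    "\n"
    "def _helper():\n"
    "    return 'internal'\n"
)

# __init__.py re-exports the public names and lists them in __all__.
(pkg / "__init__.py").write_text(
    "from ._generate_data import generate_data\n"
    "\n"
    "__all__ = ['generate_data']\n"
)

sys.path.insert(0, str(pkg_root))
sdg_demo = importlib.import_module("sdg_demo")

# A star import only picks up the names listed in __all__.
namespace = {}
exec("from sdg_demo import *", namespace)
public_names = sorted(name for name in namespace if not name.startswith("__"))
print(public_names)
```

With this layout, consumers would import from the package root (from sdg_demo import generate_data) and treat anything under an underscore-prefixed module as subject to change without notice.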