Feature Request
Description of Problem:
In some scenarios, it would be helpful to provide a statistics tool that lets users assess the quality of the synthetic data generated by datahub.
Potential Solutions:
Some existing tools already do similar work; we can investigate how to integrate them with datahub. Example: https://github.com/SauceCat/pydqc
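To make the request more concrete, here is a rough sketch of the kind of report such a tool could produce: per-column summary statistics for a reference dataset next to the same statistics for the synthetic one. It uses only the Python standard library and does not call pydqc or any datahub API. The names `column_summary` and `compare_datasets`, the row format (a list of dicts with numeric values), and the choice of statistics are all assumptions made for illustration.

```python
import statistics
from typing import Dict, List, Optional

# Hypothetical row format: column name -> numeric value (None means missing).
Row = Dict[str, Optional[float]]


def column_summary(rows: List[Row], column: str) -> Dict[str, float]:
    """Summarize one numeric column: missing rate, mean, and sample stdev."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "missing_rate": 1 - len(present) / len(values) if values else 0.0,
        "mean": statistics.fmean(present) if present else float("nan"),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(present) if len(present) > 1 else 0.0,
    }


def compare_datasets(
    reference: List[Row], synthetic: List[Row], columns: List[str]
) -> Dict[str, Dict[str, Dict[str, float]]]:
    """Build a per-column report comparing reference and synthetic statistics."""
    report = {}
    for column in columns:
        ref = column_summary(reference, column)
        syn = column_summary(synthetic, column)
        report[column] = {
            stat: {
                "reference": ref[stat],
                "synthetic": syn[stat],
                "abs_diff": abs(ref[stat] - syn[stat]),
            }
            for stat in ref
        }
    return report


# Toy data, invented for this example only.
reference = [{"age": 30.0}, {"age": 40.0}, {"age": 50.0}]
synthetic = [{"age": 32.0}, {"age": None}, {"age": 48.0}]

report = compare_datasets(reference, synthetic, ["age"])
for stat, values in report["age"].items():
    print(stat, values)
```

A real integration would likely need categorical columns, distribution-level comparisons, and a rendered report, which is where an existing tool such as pydqc might fit better than hand-rolled statistics.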