sdv-dev / SDV

Synthetic data generation for tabular data
https://docs.sdv.dev/sdv

Use of incorrect parameter name in example #2041

Closed Burhanuddin-Nahargarwala closed 4 months ago

Burhanuddin-Nahargarwala commented 4 months ago

The Positive constraint has a parameter called strict_boundaries, but the example in the docs uses strict instead, which results in an error (shown in the attached screenshots).

npatki commented 4 months ago

Hi @Burhanuddin-Nahargarwala, thanks for filing the issue.

Seems like this is actually an issue in our documentation for this constraint. Please use the keyword 'strict_boundaries', not 'strict'.

Note that the parameters are listed correctly at the top of the docs page -- it is just the example that's incorrect. We will update the example in the docs.
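For reference, a minimal sketch of the constraint with the corrected keyword, assuming the dictionary-based constraint format from the SDV 1.x docs. The column name 'age' is hypothetical, and the synthesizer call is left commented out because it needs an existing synthesizer object.

```python
# Constraint definition using the keyword discussed in this thread.
# 'age' is a hypothetical column name used only for illustration.
positive_constraint = {
    'constraint_class': 'Positive',
    'constraint_parameters': {
        'column_name': 'age',
        # Correct keyword is 'strict_boundaries'; 'strict' causes the error.
        'strict_boundaries': True,
    },
}

# Assumed usage with an already-created synthesizer (not built here):
# synthesizer.add_constraints(constraints=[positive_constraint])
```

As far as the docs describe it, strict_boundaries controls whether 0 itself is excluded (True) or allowed (False), so False may be the better fit if zeros are valid in your data.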

Do you need a Positive constraint?

To resurface our message from Slack: by default, SDV synthesizers enforce that the synthetic data adheres to the same min/max boundaries observed in the real data. This means that as long as your real data has values >= 0, the synthetic data will too, so no constraint is needed.

I realize you are adding the constraint just in case there end up being negative values (because you want to know when this happens). If at all possible, I would recommend performing such a check yourself before using SDV. Constraints are only meant to be used if you don't have any other options available -- as adding them may impact performance and quality in different ways.
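A check like the one suggested above could look roughly like this. It is only a sketch: the helper name find_negative_values and the sample values are made up for illustration, and it works on any iterable of numbers (including a pandas column) rather than anything SDV-specific.

```python
def find_negative_values(values):
    """Return (position, value) pairs for every negative entry.

    Missing entries (None) are skipped; NaN never compares below zero,
    so it is ignored as well.
    """
    negatives = []
    for position, value in enumerate(values):
        if value is None:
            continue
        if value < 0:
            negatives.append((position, value))
    return negatives


# Hypothetical sample column; replace with a column from your real data.
real_ages = [34, 0, 58, -2, 41]

negatives = find_negative_values(real_ages)
if negatives:
    print(f"Found {len(negatives)} negative value(s): {negatives}")
else:
    print("No negative values; the default min/max enforcement is enough.")
```

Running this before fitting lets you find out about negative values in the real data up front, without paying the cost of a constraint.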

npatki commented 4 months ago

Docs are now updated. (You may need to refresh the page.)

Thanks for reporting!
