Closed ankurpuri1981 closed 1 week ago
Hi there @ankurpuri1981
The SDV makes a best-guess effort during automatic metadata detection for column types and table relationships, and then provides convenience methods for updating the metadata so you can tweak and customize it. We've found this approach to be the best way to balance reducing friction (through best-guess automatic detection) with giving users transparency and control over their metadata, which helps ensure higher-quality synthetic data.
The sdtype is set to Unknown when SDV can't cleanly assign a better sdtype, and these fields are treated as PII (personally identifiable information) fields.
It looks like you've already found the metadata updating methods, but I'm linking them here as well so you have them handy: https://docs.sdv.dev/sdv/multi-table-data/data-preparation/multi-table-metadata-api#update-api
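For reference, here is a minimal sketch of how those update methods could be wrapped to fix several columns and relationships in one pass. It assumes `metadata` is an SDV `MultiTableMetadata` object exposing `update_column`, `add_relationship`, and `validate`; the helper name `apply_metadata_fixes` and all table and column names are hypothetical placeholders, not taken from your dataset.

```python
def apply_metadata_fixes(metadata, column_sdtypes, relationships):
    """Apply sdtype corrections and missing relationships to a metadata object.

    metadata:       assumed to be an sdv.metadata.MultiTableMetadata instance
    column_sdtypes: dict mapping (table_name, column_name) -> sdtype string
    relationships:  list of dicts with the four keys add_relationship expects
    """
    # Replace the auto-detected sdtype for each listed column.
    for (table_name, column_name), sdtype in column_sdtypes.items():
        metadata.update_column(
            table_name=table_name,
            column_name=column_name,
            sdtype=sdtype,
        )

    # Add parent/child links that detection did not find.
    for rel in relationships:
        metadata.add_relationship(
            parent_table_name=rel["parent_table_name"],
            child_table_name=rel["child_table_name"],
            parent_primary_key=rel["parent_primary_key"],
            child_foreign_key=rel["child_foreign_key"],
        )

    # Fail early if the edited metadata is inconsistent.
    metadata.validate()
    return metadata


# Hypothetical usage (requires sdv to be installed; names are placeholders):
# from sdv.metadata import MultiTableMetadata
# metadata = MultiTableMetadata()
# metadata.detect_from_dataframes(data)  # data: dict of table name -> DataFrame
# apply_metadata_fixes(
#     metadata,
#     column_sdtypes={("orders", "status"): "categorical"},
#     relationships=[{
#         "parent_table_name": "customers",
#         "child_table_name": "orders",
#         "parent_primary_key": "customer_id",
#         "child_foreign_key": "customer_id",
#     }],
# )
```

Keeping the corrections in plain dictionaries means the same fixes can be reapplied whenever the metadata is re-detected from fresh data.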
Out of curiosity, where does the source data that you're trying to feed into the SDV live? A database? An API endpoint? Flat files in a file store?
Hi there @ankurpuri1981 I hope my response was useful! I haven't heard from you in 2 weeks so I'm going to move forward with closing this issue out.
Feel free to open a new issue if you have more questions!
Environment Details
Please indicate the following details about the environment in which you found the bug:
Error Description
Certain attributes are mapped to the Unknown sdtype, and we have to change the sdtype using a custom script. Other attributes are identified correctly. The generated schema JSON file is attached for reference. Also, for 2 tables, the relationship was not identified, so we had to handle that within the custom script as well.
Steps to reproduce
Use the attached input dataset to generate metadata for the multi-table schema, then check the metadata JSON file.
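The check on the metadata JSON file can be scripted. The sketch below lists every column whose sdtype is unknown. It assumes the file follows a layout of `tables` -> table name -> `columns` -> column name -> `sdtype`; the helper name `find_unknown_columns` and the sample tables are hypothetical, so adjust the keys if the generated file differs.

```python
import json


def find_unknown_columns(metadata_dict):
    """Return (table_name, column_name) pairs whose sdtype is 'unknown'."""
    unknown = []
    for table_name, table in metadata_dict.get("tables", {}).items():
        for column_name, column in table.get("columns", {}).items():
            # Compare case-insensitively in case the value is capitalized.
            if str(column.get("sdtype", "")).lower() == "unknown":
                unknown.append((table_name, column_name))
    return unknown


# Hypothetical sample standing in for the generated metadata file.
sample_json = """
{
  "tables": {
    "users": {
      "columns": {
        "user_id": {"sdtype": "id"},
        "notes": {"sdtype": "unknown", "pii": true}
      }
    },
    "orders": {
      "columns": {
        "amount": {"sdtype": "numerical"}
      }
    }
  },
  "relationships": []
}
"""

sample = json.loads(sample_json)
print(find_unknown_columns(sample))
```

For a real file, load it with `json.load(open(path))` in place of the inline sample and pass the result to the same function.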