sdv-dev / SDV

Synthetic data generation for tabular data
https://docs.sdv.dev/sdv

I want to be able to use synthesizer models in a standalone way (just the model weights) #1970

Open srinify opened 2 months ago

srinify commented 2 months ago

Problem Description

A public SDV user trained a synthesizer in SDV 1.11, saved the model as a PKL file, and then tried to load it in SDV 1.12. During loading, SDV tried to import specific Faker classes, which raised an error because the Faker versions differed between the two environments.

See full context here: https://github.com/sdv-dev/SDV/issues/1959

Potential Solution

Instead of saving this much object state and trying to recreate it on load, we could save only the model weights and load those. That better represents what the model itself needs and could lead to fewer errors of the kind this user ran into.
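To illustrate the idea, here is a minimal sketch of saving only the learned parameters plus the small config needed to rebuild the object. It is not SDV code: `ToySynthesizer`, `state_dict`, `from_state_dict`, `save_weights`, and `load_weights` are hypothetical names, and JSON is just one possible storage format. The point is that the saved file holds plain numbers, so loading it never has to import Faker or any other third-party class.

```python
import json
import os
import tempfile


class ToySynthesizer:
    """Hypothetical stand-in for a synthesizer; not an SDV class."""

    def __init__(self, n_columns):
        self.n_columns = n_columns
        # Learned parameters: one mean and one scale per column.
        self.weights = {"mean": [0.0] * n_columns, "scale": [1.0] * n_columns}

    def state_dict(self):
        # Only plain data: the config to rebuild the object and its weights.
        return {"config": {"n_columns": self.n_columns}, "weights": self.weights}

    @classmethod
    def from_state_dict(cls, state):
        # Rebuild the object from the current code, then attach the weights.
        model = cls(**state["config"])
        model.weights = state["weights"]
        return model


def save_weights(model, path):
    with open(path, "w") as f:
        json.dump(model.state_dict(), f)


def load_weights(path):
    with open(path) as f:
        return ToySynthesizer.from_state_dict(json.load(f))


# Round trip through a temporary file.
model = ToySynthesizer(n_columns=3)
model.weights["mean"] = [0.5, 1.5, -2.0]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.json")
    save_weights(model, path)
    restored = load_weights(path)
```

Because the object is rebuilt by whatever code is installed at load time, a version change in a dependency only matters if the config or weight layout itself changes.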

srinify commented 1 month ago

Originally from https://github.com/sdv-dev/CTGAN/issues/358

Problem Description

I want to be able to call CTGAN models from my C++ code. Using the synthesizer PKL files doesn't work because of all the Python context that's included.

Potential Solution

Provide a way to export just the model weights in a more standalone format.
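As a rough sketch of what a standalone export could look like, the snippet below writes named weight arrays to a flat little-endian float32 file that a C++ program could parse with plain file reads. Everything here is an assumption for illustration: the `WTS1` tag, the record layout, and the `export_weights` / `import_weights` helpers are hypothetical and are not part of CTGAN or SDV.

```python
import os
import struct
import tempfile

MAGIC = b"WTS1"  # hypothetical 4-byte file tag


def export_weights(tensors, path):
    """Write {name: (shape, flat_values)} as little-endian float32 records."""
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<I", len(tensors)))
        for name, (shape, values) in tensors.items():
            size = 1
            for dim in shape:
                size *= dim
            if size != len(values):
                raise ValueError(f"shape {shape} does not match {len(values)} values")
            encoded = name.encode("utf-8")
            f.write(struct.pack("<I", len(encoded)))
            f.write(encoded)
            f.write(struct.pack("<I", len(shape)))
            f.write(struct.pack(f"<{len(shape)}I", *shape))
            f.write(struct.pack(f"<{len(values)}f", *values))


def import_weights(path):
    """Read the file back; a C++ reader would follow the same steps."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not a weights file")
        (count,) = struct.unpack("<I", f.read(4))
        tensors = {}
        for _ in range(count):
            (name_len,) = struct.unpack("<I", f.read(4))
            name = f.read(name_len).decode("utf-8")
            (ndim,) = struct.unpack("<I", f.read(4))
            shape = struct.unpack(f"<{ndim}I", f.read(4 * ndim))
            size = 1
            for dim in shape:
                size *= dim
            values = list(struct.unpack(f"<{size}f", f.read(4 * size)))
            tensors[name] = (shape, values)
    return tensors


# Round trip with values that float32 represents exactly.
weights = {
    "layer0.weight": ((2, 2), [0.5, -1.25, 2.0, 0.0]),
    "layer0.bias": ((2,), [0.25, -0.5]),
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.bin")
    export_weights(weights, path)
    loaded = import_weights(path)
```

A file like this carries no Python context at all, only names, shapes, and numbers, which is the property the C++ use case needs. The consumer would still have to reimplement the network's forward pass on its side.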