sdv-dev / SDV

Synthetic data generation for tabular data
https://docs.sdv.dev/sdv

Data Generation with four relational tables going out of memory #263

Open abhisheknagar1983 opened 3 years ago

abhisheknagar1983 commented 3 years ago

Description

We tried to generate data with SDV for four relational tables. The details are below:

Total rows for training: 500 (approx.)
Total columns: 15
Memory: 32 GB

Training runs for 2-3 hours, after which we get an out-of-memory error. CPU and memory consumption both go very high during training.

abhisheknagar1983 commented 3 years ago

Just to add: this was a multi-parent scenario, where we have three child tables and one parent table. We are trying with a very minimal training data set, but CPU and memory consumption still go very high during training, and after a couple of hours it runs out of memory. Please let me know in case you need more info.

Wim65 commented 3 years ago

Small erratum: 3 parents / 1 child

abhisheknagar1983 commented 3 years ago

Does SDV have any guidelines for memory requirements that we could use to estimate the minimum memory needed?
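
Until there are published guidelines, one way to get a rough empirical estimate is to measure peak memory while training on progressively larger samples and see how it grows. Below is a minimal sketch using Python's standard `tracemalloc` module. The helper `peak_memory_mib` and the stand-in `fake_train` are hypothetical names for illustration only; `fake_train` would be replaced by whatever call actually fits the model. Note that `tracemalloc` only sees allocations made through Python's memory allocators, so memory used by native code may not be fully counted.

```python
import tracemalloc


def peak_memory_mib(train, *args, **kwargs):
    """Run train(*args, **kwargs) and return (result, peak traced memory in MiB)."""
    tracemalloc.start()
    try:
        result = train(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def fake_train(n_rows):
    # Hypothetical stand-in for a real fit call (an assumption, not SDV code):
    # allocates roughly 1 KiB per row so that memory grows with input size.
    return [bytearray(1024) for _ in range(n_rows)]


if __name__ == "__main__":
    # Measure on growing sample sizes to see how peak memory scales.
    for n_rows in (100, 200, 400):
        _, peak = peak_memory_mib(fake_train, n_rows)
        print(f"{n_rows} rows -> peak {peak:.2f} MiB")
```

If the peak grows much faster than the number of rows or tables, that would suggest the memory cost is driven by the table relationships rather than by the data volume itself.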