Closed manurajmr1 closed 2 months ago
Hi @manurajmr1 nice to meet you.
Since your questions are related to our free vs. paid plans, I would encourage you to reach out here for more clarity. This GitHub is primarily meant for tracking bugs and troubleshooting code in the free version.
To answer your Qs briefly:
Hi @manurajmr1, I'm closing off this issue since we've pointed you to the right venue for this type of Q.
If you need help troubleshooting any code when using SDV, please feel free to file a new issue here with any related code snippet(s).
Does SDV Enterprise solve the performance issue? The free SDV SDK takes too much time to generate data.
Environment details
Problem description
Hi, I wanted to use SDV to generate multi-table data. For that I used HMASynthesizer, which is free, and I have a few questions about it.

The free HMASynthesizer felt slow while testing. How much scalability can it provide? For example, 1M rows of data with constraints took around 3 days for me. The docs say: "The HMA Synthesizer uses hierarchical ML algorithm to learn from real data and generate synthetic data. The algorithm uses classical statistics." That means it doesn't leverage the GPU, right, since it's not a neural network?

I also found other, paid synthesizers such as HSASynthesizer, IndependentSynthesizer, etc. Do these leverage the GPU, if they support a neural net synthesizer? And how much time would it take to generate around 1M rows of synthetic data with 2 tables of 5 columns each, maintaining a primary key/foreign key relation between the tables and a date constraint such as hotel booking date < hotel checkout date?

Is a trial of the paid version available, so I can see whether it supports GPU neural training and parallelism, such as distributing the load across multiple GPUs for faster performance (as PyTorch supports by default)?
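To make the data shape concrete, here is a minimal sketch of the kind of two-table dataset described above. It uses only the Python standard library, not SDV. The table names (`guests`, `bookings`), the column names, and the helpers `make_hotel_tables` and `is_valid` are illustrative assumptions, not the actual schema or any SDV API.

```python
import random
from datetime import date, timedelta


def make_hotel_tables(n_guests, n_bookings, seed=0):
    """Build two toy tables (lists of dicts) with 5 columns each.

    Hypothetical schema: guests.guest_id is the primary key and
    bookings.guest_id is the foreign key that references it.
    """
    rng = random.Random(seed)
    guests = [
        {
            "guest_id": i,
            "age": rng.randint(18, 90),
            "country": rng.choice(["US", "IN", "DE", "BR"]),
            "loyalty_tier": rng.choice(["none", "silver", "gold"]),
            "has_rewards": rng.random() < 0.5,
        }
        for i in range(n_guests)
    ]
    start = date(2024, 1, 1)
    bookings = []
    for j in range(n_bookings):
        booking_date = start + timedelta(days=rng.randrange(365))
        # Checkout is at least one day later, so booking_date < checkout_date.
        checkout_date = booking_date + timedelta(days=rng.randint(1, 14))
        bookings.append(
            {
                "booking_id": j,
                "guest_id": rng.randrange(n_guests),
                "booking_date": booking_date,
                "checkout_date": checkout_date,
                "room_rate": round(rng.uniform(50, 500), 2),
            }
        )
    return guests, bookings


def is_valid(guests, bookings):
    """Check primary key uniqueness, foreign key integrity, and the date rule."""
    guest_ids = {g["guest_id"] for g in guests}
    return (
        len(guest_ids) == len(guests)
        and all(b["guest_id"] in guest_ids for b in bookings)
        and all(b["booking_date"] < b["checkout_date"] for b in bookings)
    )
```

The same `is_valid` check could be run on synthetic output to confirm that the key relation and the date constraint survived generation.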
What I already tried
I tried the multi-table data use case and saw that the process is slow, but the quality of the generated data is good.