Open Pavan-Kalyan1432 opened 1 week ago
Hi @Pavan-Kalyan1432 can you clarify what you mean by "repeating the combinations but it is not considering all the combinations"?
When generating synthetic data, using this constraint will ensure that the synthesizer will only use the same combinations of values in these 4 columns that exist in your real data. So, for example, if you only have rows containing the combination: "Jack", "John", "Jay", and "Jack John Jay" for your 4 columns, then this will be the only combination that will show up in the synthetic data.
For example it is repeating the same combination multiple times and also it is not considering all the combinations that are in real data
Hi @Pavan-Kalyan1432, if I may jump in here: The purpose of the FixedCombinations constraint is only to fix the combinations that are created. Adding this constraint will prevent new permutations from being synthesized in the columns you specify.
If you sample many many more times, then I think due to random chance, you will eventually end up creating all the combinations that were in the original data.
However, preventing repetition is not the purpose of this constraint. May I ask why you want to prevent the repetition in your data? This indicates to me that in your synthetic data, you just want the same exact same names to appear in the exact same rows as your real data. Is that correct? If you could provide more information on your usage (what are you trying to accomplish with synthetic data), we can better guide you to a solution. Thanks.
Environment details
If you are already running SDV, please indicate the following details about the environment in which you are running it:
Problem description
What I already tried
Here Fixed combinations is repeating the combinations but it is not considering all the combinations... What to do to make it consider all the combinations of first name, middle name, last name and full name of the real data