Hello,
I've been experimenting with GAN-based synthetic data generation using your repository's modules, following the documentation provided for them. However, I run into an error whenever I try to generate synthetic datasets with epsilon values smaller than 0.2.
Here's the error I encountered:
ValueError: Inputted epsilon parameter is too small to create a private dataset. Try increasing epsilon and rerunning.
For context, I was performing experiments using the Adult dataset, where I've already transformed numerical attributes into categorical ones.
From what I understand, PATE-GAN uses the moments accountant to track the cumulative privacy loss across training iterations. Even so, I don't understand why this error is raised for every epsilon value below 0.2.
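To make my question more concrete, here is a minimal sketch of how I understand the moments-accountant conversion from the PATE-GAN paper: the reported epsilon is the minimum over moment orders l of (alpha_l + log(1/delta)) / l. The function name, the choice of 100 as the highest moment order, and the idea that the library compares this value against the requested epsilon are all assumptions on my part; I have not verified them against the snsynth source. If they roughly hold, the bound cannot drop below log(1/delta) / 100 even before any training, which for delta = 1e-9 lands close to the 0.2 threshold I am seeing.

```python
import math


def min_reportable_epsilon(delta, max_moment_order, alphas=None):
    """Smallest epsilon that min_l (alpha_l + log(1/delta)) / l can yield.

    Hypothetical helper for illustration only, not part of snsynth.
    alphas[l - 1] holds the accumulated moment of order l; it is all
    zeros before any teacher query has been answered.
    """
    if alphas is None:
        alphas = [0.0] * max_moment_order
    return min(
        (alphas[l - 1] + math.log(1.0 / delta)) / l
        for l in range(1, max_moment_order + 1)
    )


# Assumption: moments up to order 100 are tracked.
floor = min_reportable_epsilon(delta=1e-9, max_moment_order=100)
print(round(floor, 3))  # 0.207
```

If this reading is right, a larger delta or a higher maximum moment order would lower that floor, but I would appreciate confirmation of whether this is actually what triggers the error.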
Below is the code snippet I used for generating the dataset:
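import pandas as pd
from snsynth.pytorch.nn import PATEGAN
from snsynth.pytorch import PytorchDPSynthesizer

# Load the Adult dataset (numerical attributes already converted to categorical).
adult_path = './adult.csv'
adult = pd.read_csv(adult_path, index_col=None)

# Privacy parameters; any epsilon below 0.2 triggers the ValueError.
epsilon = 0.1
delta = 1e-9

# The same epsilon is passed to both the wrapper and the PATE-GAN model.
synth = PytorchDPSynthesizer(epsilon, PATEGAN(epsilon=epsilon, delta=delta), None)

# Treat every column as categorical, then sample as many rows as the original data.
synth.fit(adult, categorical_columns=adult.columns.values.tolist())
sample = synth.sample(len(adult))
print(sample)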
Could you please help me understand the reason behind this error and how I might resolve it for epsilon values less than 0.2?
Thank you for your assistance.