@MustafaS993 That looks like a bug on our end. Both `dp_microbatches` and `batch_size` are set to 64 when `dp=True`, but `dp_microbatches` needs to be less than `batch_size` and divide evenly into it. We'll patch this in the next release; in the meantime, you can adjust your config.
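For reference, here's a minimal sketch of that constraint (the helper name is hypothetical, not part of the library):

```python
def validate_dp_batching(batch_size: int, dp_microbatches: int) -> None:
    """Hypothetical check: dp_microbatches must be smaller than
    batch_size and divide it evenly."""
    if not (dp_microbatches < batch_size and batch_size % dp_microbatches == 0):
        raise ValueError(
            f"dp_microbatches={dp_microbatches} must be less than "
            f"batch_size={batch_size} and divide it evenly"
        )

validate_dp_batching(4, 1)    # passes: matches the config suggested below
validate_dp_batching(64, 64)  # raises ValueError: the buggy defaults when dp=True
```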
Check out some examples here: https://gretel.ai/blog/practical-privacy-with-synthetic-data
Try using this config as a start for your DP settings. We're planning to add a differential privacy template as a default config soon.
```python
config_template = {
    "checkpoint_dir": checkpoint_dir,
    "vocab_size": 0,  # a lower vocab size helps DP training
    "epochs": 50,
    "early_stopping": True,
    "learning_rate": 0.001,
    "rnn_units": 256,
    "batch_size": 4,
    "predict_batch_size": 1,
    "dp": True,
    "dp_noise_multiplier": 0.001,
    "dp_l2_norm_clip": 10,
    "dp_microbatches": 1,
    "overwrite": True,
}
```
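For completeness, a sketch of how this template would typically be passed to the trainer in the blueprint notebook, assuming the gretel-synthetics local API (`input_data_path` is a placeholder you'd replace with your own file):

```python
from gretel_synthetics.config import LocalConfig
from gretel_synthetics.train import train_rnn

# Build the training config from the template above;
# checkpoint_dir is assumed to be defined earlier in the notebook.
config = LocalConfig(
    input_data_path="my_training_data.csv",  # placeholder path
    **config_template,
)
train_rnn(config)
```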
@zredlined Thanks for the quick reply; your config template has got things running again! I will also check out the blog post for more information about the parameters.
I'm trying to train a dataset with the Create Synthetic Data blueprint. In the config template I have set `"dp": True` and I get this error:

From my understanding, the issue comes from the tensors being processed by the differential-privacy algorithm. I can train without issue when `"dp": False`. Also worth noting: I could train with `"dp": True` on the same blueprint notebook without issue 3 months ago.