There are several use cases for synthetic data, this is about a case where the model author is willing to share the model with us but not potential data recipients.
It would be good if we in situations as such could support generating custom datasets for delivery to data consumers.
This can possibly be different use cases:
the model author says that N data points generated with the given seed are anonymous
generated content is not guaranteed to be anonymous and is thus to be considered as sensitive data
This is conceptually orthogonal to whatever the model is deterministic (i.e. whatever we can regenerate delivered content), but we should generally strive to encourage deterministic models and maybe don't want to support non-deterministic ones.
There are several use cases for synthetic data, this is about a case where the model author is willing to share the model with us but not potential data recipients.
It would be good if we in situations as such could support generating custom datasets for delivery to data consumers.
This can possibly be different use cases:
This is conceptually orthogonal to whatever the model is deterministic (i.e. whatever we can regenerate delivered content), but we should generally strive to encourage deterministic models and maybe don't want to support non-deterministic ones.