sodascience / metasyn

Transparent and privacy-friendly synthetic data generation
https://metasyn.readthedocs.io
MIT License

Improve description of the package, add examples, and use cases across disciplines (JOSS paper) #321

Open PetrKorab opened 2 months ago

PetrKorab commented 2 months ago

The package has detailed documentation, but the paper is quite difficult to read for a non-specialist audience, mainly in the second part. Please add example use cases of the package for researchers in various areas and leave part of the technicalities for the documentation.

It will help researchers across disciplines better understand the benefits of the package. Thanks!

misken commented 1 month ago

I concur. The documentation is quite detailed, so there is no need to put too much of it into the paper. Specific usage examples from different domains might help readers envision their own use cases for metasyn.

misken commented 1 month ago

To follow up on this issue: beyond adding specific usage examples to the paper, a good future task would be to base some of the documentation examples on more realistic datasets. I know that many people are familiar with the Titanic dataset, and it does have fields of the kind that might need to be obscured with tools like metasyn. However, it might really help potential users to see usage examples based on your real experience of using metasyn to create synthetic data. The fruit-based examples don't really scream out for the need for synthetic data. :)