best-practice-and-impact / ons-spark

MIT License

[New page]: Add a page on generating synthetic data #143

Open emercado4 opened 2 months ago

Summary of Content

I think @NathanKelly-ONS already suggested this in one of our whiteboard sessions, but I'm adding it as an issue now because it might help people who are asking about reducing resource use for migration.

The page should cover methods and useful packages for generating synthetic data (e.g. Faker for Python and fakeR for R) to support the existing big data workflow page.
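
As a rough starting point for the page, here is a minimal sketch of generating a small synthetic dataset using only the Python standard library, so it runs without extra installs. The column names, region list, value ranges and the helper names `make_synthetic_rows` and `rows_to_csv` are all made up for illustration; a package such as Faker could replace the hand-rolled name generation with more realistic values.

```python
import csv
import io
import random
import string


def make_synthetic_rows(n_rows, seed=42):
    """Return n_rows of made-up person records as a list of dicts.

    All columns and value ranges are illustrative assumptions,
    not taken from any real dataset.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    regions = ["North East", "Wales", "London", "Scotland"]
    rows = []
    for i in range(n_rows):
        rows.append({
            "id": i,
            # Random 8-letter string standing in for a name
            "name": "".join(rng.choices(string.ascii_lowercase, k=8)).title(),
            "age": rng.randint(16, 90),
            "region": rng.choice(regions),
            # Log-normal draw gives a right-skewed, income-like column
            "income": round(rng.lognormvariate(10, 0.5), 2),
        })
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = make_synthetic_rows(5)
print(rows_to_csv(sample))
```

A small file like this could then be read into a Spark session for testing a pipeline cheaply before running it on the full data; the exact loading step would depend on the environment, so it is left out here.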

Language Version

No response

Can this suggestion be used in PySpark and/or sparklyr? (Can select multiple)

PySpark, sparklyr

Code snippets

No response

Code of Conduct