Open mcleo-d opened 4 years ago
The first thing I suspect we want to do is define the architecture of the system, i.e. the components required in a synthetic data flow:
Data Set Attribute Predictor - i.e. what are the columns and types, and what are the relationships between datasets (e.g. Counterparties, Securities and Trades are related sets)
Best Model Selector - based on a dataset and knowing the attributes, what's the best analysis module
Analysis Model Output - GAN, Markov processes, etc.
Hand-Crafted Rules Specification - allow for sketching out a dataset by hand, or a hybrid (use a model as input, then add to that model with additional data)
I think we can leave UI/Services out for the moment; that's just how things get hosted.
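To make the four components above a bit more concrete, here is a rough Python sketch of how they might fit together. Everything in it is an assumption for discussion only: the names (`ColumnSpec`, `DatasetSchema`, `predict_attributes`, `select_model`, `generate`) are hypothetical and not taken from DataHub or DataHelix, and the two toy samplers are simple stand-ins for real analysis models such as a GAN or a Markov process.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ColumnSpec:
    name: str
    dtype: str  # Python type name of the sampled value, e.g. "int", "float", "str"


@dataclass
class DatasetSchema:
    name: str
    columns: List[ColumnSpec]
    related_to: List[str] = field(default_factory=list)  # names of related datasets


def predict_attributes(name, rows, related_to=()):
    """Data Set Attribute Predictor: infer columns and types from the first sample row."""
    columns = [ColumnSpec(k, type(v).__name__) for k, v in rows[0].items()]
    return DatasetSchema(name, columns, list(related_to))


def sample_empirical(schema, rows, rng):
    """Toy model: draw each column independently from its observed values."""
    return {c.name: rng.choice([r[c.name] for r in rows]) for c in schema.columns}


def sample_uniform(schema, rows, rng):
    """Toy model for numeric data: draw uniformly between observed min and max."""
    record = {}
    for c in schema.columns:
        values = [r[c.name] for r in rows]
        x = rng.uniform(min(values), max(values))
        record[c.name] = round(x) if c.dtype == "int" else x
    return record


# Analysis Model Output: a registry of available models (stand-ins only).
MODELS: Dict[str, Callable] = {"empirical": sample_empirical, "uniform": sample_uniform}


def select_model(schema):
    """Best Model Selector: a trivial rule based on the predicted attributes."""
    numeric = all(c.dtype in ("int", "float") for c in schema.columns)
    return "uniform" if numeric else "empirical"


def generate(schema, rows, n, rng, rules=None):
    """Run the selected model, then apply hand-crafted rules on top (the hybrid case)."""
    model = MODELS[select_model(schema)]
    rules = rules or {}
    out = []
    for i in range(n):
        record = model(schema, rows, rng)
        for column, rule in rules.items():
            record[column] = rule(i, record)  # a rule overrides the model's value
        out.append(record)
    return out


# Usage with made-up sample data.
trades = [
    {"trade_id": 1, "counterparty": "ACME", "notional": 100.0},
    {"trade_id": 2, "counterparty": "Globex", "notional": 250.5},
]
schema = predict_attributes("Trades", trades, related_to=["Counterparties", "Securities"])
synthetic = generate(
    schema, trades, n=3, rng=random.Random(0),
    rules={"trade_id": lambda i, record: 1000 + i},  # hand-crafted: sequential ids
)
```

The point of the sketch is only the separation of concerns: schema inference, model selection, model output and hand-written rules are independent pieces, so any one of them could be swapped out without touching the others.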
<img src='https://g.gravizo.com/svg? /* Structural Things @opt commentname @note Notes can be extended to span multiple lines */ class Structural{}
/* @opt all @note Class */ class Counter extends Structural { static public int counter; public int getCounter%28%29; }
/* @opt shape activeclass @opt all @note Active Class */ class RunningCounter extends Counter{} '>
Description
DataHub and DataHelix should start reviewing, agreeing on, and defining common API endpoints so that the two projects can interoperate and grow an ecosystem that other Synthetic Data Generation projects can join.
Success Criteria