nhsengland / NHSSynth

Package to accompany P41
https://nhsengland.github.io/NHSSynth/
MIT License

Improving the Python API #72

Closed HarrisonWilde closed 1 year ago

HarrisonWilde commented 1 year ago

We are currently at:

# These import paths should be easy to shorten at some point; not important for now
from nhssynth.modules.dataloader.metatransformer import MetaTransformer
from nhssynth.modules.model.models import DPVAE

import pandas as pd

data = pd.read_csv("data/support.csv")
# This step is technically optional: we could instead call MetaTransformer(data), but it would then auto-generate the metadata (which is imperfect)
mt = MetaTransformer.from_path(data, "data/support_metadata.yaml")
prepared_data = mt.apply()

# It makes far more sense (in my opinion) to instantiate the model with the data, since the data determines the structure of the instantiated object, which is also the thing we want to save, etc.
model = DPVAE.from_metatransformer(prepared_data, mt)
results = model.train()

synthetic_data = model.generate(1000)

This is not exactly what #27 set out to achieve, but we are about 90% of the way there. I will open a smaller follow-up issue under extended implementation to clean this up further.
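
For anyone reading later, the "instantiate the model with the data" idea can be sketched without NHSSynth installed. This is only a toy illustration of the alternate-constructor (classmethod) pattern: `ToyTransformer`, `ToyModel` and `from_transformer` are hypothetical names made up for this example, not the package's real classes, and the "training" step is just a per-column mean.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyTransformer:
    """Hypothetical stand-in for a metadata-aware transformer."""
    columns: list

    def apply(self, rows):
        # Keep only the known columns, in a fixed order.
        return [[row[c] for c in self.columns] for row in rows]


@dataclass
class ToyModel:
    """Hypothetical stand-in for a generative model."""
    data: list
    n_features: int
    means: list = field(default_factory=list)

    @classmethod
    def from_transformer(cls, prepared, transformer):
        # The model's shape is derived from the transformer,
        # so the caller never passes dimensions by hand.
        return cls(data=prepared, n_features=len(transformer.columns))

    def train(self):
        # Toy "training": store the mean of each column.
        self.means = [sum(col) / len(col) for col in zip(*self.data)]
        return {"means": self.means}

    def generate(self, n, seed=0):
        # Toy "generation": Gaussian noise around each column mean.
        rng = random.Random(seed)
        return [[rng.gauss(m, 1.0) for m in self.means] for _ in range(n)]


rows = [{"age": 40, "bp": 120}, {"age": 60, "bp": 140}]
transformer = ToyTransformer(columns=["age", "bp"])
prepared = transformer.apply(rows)

model = ToyModel.from_transformer(prepared, transformer)
results = model.train()
synthetic = model.generate(5)
```

The point of the pattern is that the object holding the data-dependent structure is the same object that gets trained and saved, so there is a single thing to persist.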