bnowok / synthpop

Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

Saving and loading models #27

Open notna07 opened 1 year ago

notna07 commented 1 year ago

I am working on synthetic data, and I must say the CART models in synthpop work amazingly well.

However, I would like to be able to save and load models for better comparison and replication. Additionally, many modern use cases seem to require that the fitted models can generate data without using the real data as input. Exploring the source code a little, I found that the syn function has an argument, models, that is turned off by default and that, if enabled, produces something like:

$Species
n= 150

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)
   2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
   3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)
     6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259)
      12) Petal.Length< 4.95 48   1 versicolor (0.00000000 0.97916667 0.02083333) *
      13) Petal.Length>=4.95 6   2 virginica (0.00000000 0.33333333 0.66666667) *
     7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *

for the Iris data. This is nice, but I cannot figure out where to pass this $models object so that new data can be generated from these tree structures.
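
For reference, the output above can be obtained and stored roughly as in the sketch below. It assumes the synthpop package is installed, uses base R's saveRDS and readRDS for persistence, and writes to a temporary file whose name (iris_synds.rds) is only illustrative. It covers saving and reloading the fitted models only, not generating new data from the reloaded trees.

```r
# Sketch only: assumes the synthpop package is installed.
library(synthpop)

# Run CART-based synthesis and keep the fitted models
# (the models argument is FALSE by default).
fitted <- syn(iris, method = "cart", models = TRUE, seed = 1,
              print.flag = FALSE)

# Fitted models are stored per variable; the Species entry should
# print a tree similar to the one shown above.
print(fitted$models$Species)

# Persist the whole result (synthetic data plus models) and reload it.
# The file name is illustrative, not something synthpop requires.
model_path <- file.path(tempdir(), "iris_synds.rds")
saveRDS(fitted, model_path)
reloaded <- readRDS(model_path)

# The reloaded object carries the same list of models.
print(names(reloaded$models))
```

This keeps the trees available for inspection and comparison across sessions without refitting.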

I guess I am missing something, and would really appreciate it if someone knows the correct procedure.