vanderschaarlab / synthcity

A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
https://www.vanderschaar-lab.com/
Apache License 2.0

Add support for Adversarial Random Forest generative models #191

Closed robsdavis closed 1 year ago

robsdavis commented 1 year ago

Feature Description

Add support for an ARF (Adversarial Random Forest) generator using the arfpy library, which is available on GitHub and PyPI.

ZhaozhiQIAN commented 1 year ago

Note that the arfpy documentation states that "ARFs naturally handle unstructured data with mixed continuous and categorical covariates."

I tested the code and it can handle categorical columns without pre-processing (e.g. dataframes with string columns). Hence, the implementation does not need to use the TabularEncoder class, unlike the GAN-based generators.

Also, conditional generation is currently not implemented for ARF, so let's raise an exception if the user attempts to sample conditionally.
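
A minimal sketch of that guard, to make the proposal concrete. Everything named here is an assumption for illustration: `ARFPluginSketch`, its `fit`/`generate` methods and the `cond` argument are hypothetical and are not synthcity's actual plugin interface. The sampler is only a placeholder that resamples training rows, standing in for where the arfpy model would be trained and sampled. Rows are kept as plain dictionaries so the example has no dependencies.

```python
import random
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


class ARFPluginSketch:
    """Hypothetical generator wrapper (not synthcity's real plugin class)."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._rows: List[Row] = []

    def fit(self, rows: List[Row]) -> "ARFPluginSketch":
        # Rows are stored as-is: string (categorical) values are not encoded,
        # reflecting the point that no TabularEncoder step is needed.
        # A real plugin would train the arfpy model at this point.
        self._rows = [dict(r) for r in rows]
        return self

    def generate(self, count: int, cond: Optional[Any] = None) -> List[Row]:
        # Fail loudly instead of silently ignoring the condition.
        if cond is not None:
            raise NotImplementedError(
                "Conditional generation is not implemented for ARF."
            )
        if not self._rows:
            raise RuntimeError("Call fit() before generate().")
        # Placeholder sampler (assumption): resample training rows with
        # replacement. This is NOT the ARF sampling procedure.
        return [dict(self._rng.choice(self._rows)) for _ in range(count)]
```

With this shape, `ARFPluginSketch().fit(rows).generate(10)` returns ten rows, while passing any non-`None` `cond` raises `NotImplementedError` before any sampling happens, so the user finds out immediately that the condition was not applied.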