The `BayesGMMTransformer` should be tuned to improve performance. The current parameters (`weight_threshold` and the default values passed to `BayesianGaussianMixture`) should be benchmarked, and new default values should be chosen based on the results.
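As a starting point for that experimentation, here is a minimal sketch of a parameter sweep. It uses scikit-learn's `BayesianGaussianMixture` directly rather than the transformer, and it assumes that a component counts as valid when its fitted weight exceeds `weight_threshold`. The helper `count_valid_components`, the synthetic data and the candidate values are all hypothetical and only illustrate the shape of the experiment.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def count_valid_components(data, weight_threshold, **bgm_kwargs):
    """Fit a mixture and count the components whose weight exceeds the threshold.

    Hypothetical helper: assumes "valid" means weights_ > weight_threshold.
    """
    bgm = BayesianGaussianMixture(random_state=0, **bgm_kwargs)
    bgm.fit(np.asarray(data, dtype=float).reshape(-1, 1))
    return int((bgm.weights_ > weight_threshold).sum())


# Synthetic bimodal column, used only for illustration.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-5, 1, 300), rng.normal(5, 1, 300)])

# Sweep a few candidate values (chosen arbitrarily for this sketch).
results = {}
for weight_threshold in (0.001, 0.005, 0.01):
    for prior in (0.001, 0.1):
        results[(weight_threshold, prior)] = count_valid_components(
            sample,
            weight_threshold,
            n_components=10,
            weight_concentration_prior_type='dirichlet_process',
            weight_concentration_prior=prior,
            n_init=1,
        )

for params, n_valid in sorted(results.items()):
    print(params, n_valid)
```

A real comparison would also need a quality metric for the downstream synthetic data and a timing measurement, not just the number of retained components.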
The code can also be sped up. `reverse_transform` is already much faster than the other two methods, and `fit` spends almost all of its time fitting the `BayesianGaussianMixture`, which is unavoidable. The biggest gains are therefore in the `transform` method, specifically the following lines:
https://github.com/sdv-dev/RDT/blob/6b07fee5f88d278c667ba0fef2d3729bb2d4195c/rdt/transformers/numerical.py#L625-L632
These lines account for the majority of the transformation runtime, so any improvement there would significantly speed up the whole process.
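One possible direction, assuming the hot spot is a per-row loop that samples a mixture component from each row's component probabilities (that is an assumption about the linked lines, not a statement of what they contain): the loop could be replaced by a single vectorized inverse-CDF draw. The function name `sample_components_vectorized` is hypothetical.

```python
import numpy as np


def sample_components_vectorized(probabilities, rng):
    """Sample one component index per row without a Python-level loop.

    probabilities: array of shape (n_rows, n_components) with non-negative rows.
    rng: a numpy.random.Generator.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    # Normalize each row so it sums to 1.
    normalized = probabilities / probabilities.sum(axis=1, keepdims=True)
    # Row-wise cumulative distribution; force the last column to exactly 1
    # so floating-point rounding cannot push an index out of range.
    cumulative = normalized.cumsum(axis=1)
    cumulative[:, -1] = 1.0
    # One uniform draw in [0, 1) per row.
    draws = rng.random(len(normalized))
    # The selected index is the number of cumulative values <= the draw.
    return (draws[:, None] >= cumulative).sum(axis=1)


rng = np.random.default_rng(0)
probs = np.tile([0.2, 0.8], (10000, 1))
selected = sample_components_vectorized(probs, rng)
print(selected.mean())  # fraction of rows assigned to component 1
```

This consumes the random stream differently from a per-row `np.random.choice` call, so results for a fixed seed would change even though the sampling distribution stays the same.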