sattarov / FinDiff

Implementation of the paper: "FinDiff: Diffusion Models for Financial Tabular Data Generation"

Categorical Features in initial reference implementation #1

Closed fountaindive closed 1 month ago

fountaindive commented 3 months ago

Hi, this looks really cool and thanks for adding an implementation online.

I'm playing around with the Google Colab example (linked from the README), and I've found that the numerical columns are modelled very well but the categorical columns are not. You can see this by comparing, for each feature, the 1D histograms of the training data and the synthetic data.
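For a categorical column, that per-feature comparison can be reduced to a single number. The following is a minimal sketch, not code from the notebook: the function names and the toy columns are made up for illustration, and total variation distance is just one reasonable choice of metric.

```python
from collections import Counter


def category_frequencies(values):
    """Return the relative frequency of each category in a non-empty column."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(real_values, synthetic_values):
    """Half the L1 distance between two categorical distributions.

    0.0 means identical frequencies; 1.0 means no shared categories.
    """
    real = category_frequencies(real_values)
    synthetic = category_frequencies(synthetic_values)
    categories = set(real) | set(synthetic)
    return 0.5 * sum(
        abs(real.get(c, 0.0) - synthetic.get(c, 0.0)) for c in categories
    )


# Toy columns standing in for one categorical feature (hypothetical values).
real_column = ["A", "A", "B", "C"]
synthetic_column = ["A", "B", "B", "B"]
tvd = total_variation_distance(real_column, synthetic_column)
```

Computing this for every categorical feature gives a quick way to check whether a change to the training setup actually narrows the gap.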

Do you have any hints to improve the performance for categorical features?

Thanks!

sattarov commented 2 months ago

Hi,

Thank you for mentioning that, and sorry for the late reply. The reason is that the initial training in the educational Colab notebook was set to only 30 epochs. If you increase it by at least 500, you will already notice an improvement. I have updated the notebook accordingly and also added automatic switching to GPU so that training runs faster. Feel free to check it and let us know if you encounter any issues.
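As a rough sketch of what these two changes typically look like in PyTorch (this is an assumption about the general pattern, not the notebook's actual code; `N_EPOCHS`, the stand-in model, and the tensor shapes are hypothetical):

```python
import torch

# The original setting was 30 epochs; the suggestion above is to raise it
# by at least 500. The variable name is hypothetical.
N_EPOCHS = 30 + 500

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model and batch, only to show that both must live on the same device.
model = torch.nn.Linear(8, 8).to(device)
batch = torch.randn(4, 8, device=device)
output = model(batch)
```

On a Colab runtime without a GPU attached, `device` resolves to the CPU and the same code still runs, only more slowly.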

Best regards, Timur

fountaindive commented 2 months ago

Ah ok thanks for getting back to me! I'll give it a go

sattarov commented 1 month ago

Hi,

I hope it helped. I am closing the issue due to inactivity. Feel free to reopen it if you encounter problems.

Best regards, Timur