yandex-research / tab-ddpm

[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
https://arxiv.org/abs/2209.15421
MIT License

Loss goes to nan for more than 10 categorical variables #40

Open kara-liu opened 3 months ago

kara-liu commented 3 months ago

I am having an issue where TabDDPM does really well when the total number of variables is < 10-15, but for modest to high numbers of binary categorical variables (even over 100, although my goal is 1000s), the loss (both mloss and gloss) quickly goes to nan. This doesn't seem to be an issue with my dataset - I get the same behavior when I make a synthetic dataset of Gaussian continuous variables and Bernoulli(p=.5) binary variables. Any help would be appreciated.
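For reference, a minimal sketch of the kind of synthetic dataset described above (Gaussian continuous columns plus Bernoulli(p=.5) binary columns). The column counts, function name, and seed here are illustrative, not taken from the actual reproduction:

```python
import numpy as np

def make_synthetic(n_rows=10_000, n_num=5, n_cat=100, seed=0):
    """Hypothetical reproduction data: Gaussian continuous features and
    Bernoulli(0.5) binary categorical features, split the way TabDDPM
    expects numerical and categorical inputs."""
    rng = np.random.default_rng(seed)
    X_num = rng.normal(size=(n_rows, n_num))            # continuous columns
    X_cat = rng.binomial(1, 0.5, size=(n_rows, n_cat))  # binary categorical columns
    return X_num, X_cat

X_num, X_cat = make_synthetic()
print(X_num.shape, X_cat.shape)  # (10000, 5) (10000, 100)
```

With `n_cat` around 100 or more, this reportedly triggers the nan losses, while keeping the total column count under ~10-15 does not.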

Yunbo-max commented 3 months ago

I am running into the same issue.

vuhoangminh commented 2 months ago

Same issue as well...