gretelai / public_research

Where Gretel published notebooks and code for blog posts
Apache License 2.0

Results for WWT Autocorrelation and Attribute Distribution for doppelganger_pytorch.ipynb could not be reproduced for pytorch_fast #7

Closed: rllyryan closed this issue 1 year ago

rllyryan commented 1 year ago

Dear repository owners,

I could not replicate the autocorrelation figure for the WWT dataset in the doppelganger_pytorch.ipynb notebook. Could I ask why this is so? (I ran the notebook as is.)

I used Google Colab's V100 and A100 GPUs. I am not sure whether this is due to an update in PyTorch or something else entirely.

Edit: [image]

This was different from the one shown in the notebook, with a greater number of bars.

Thank you!

kboyd commented 1 year ago

I'm not sure what's going on with the DGAN model on this data. I see the same poorer quality from the notebook across a few setups. Will need to investigate further.

Thanks for trying out our notebooks!

rllyryan commented 1 year ago

Thank you for looking into this! Maybe it is due to the dynamic nature of Google Colab; things don't always go right on that web IDE.

Nonetheless, the reduction in runtime is really great work (the original DoppelGANger took me 12.5 hours to run, which is too long in my opinion).

Please do let me know when it is ready to be tested again! Have a great week ahead :)

kboyd commented 1 year ago

Notebook is updated and should work as expected now.

We improved the gretel-synthetics package to return DataFrame columns in the same order as the training data, but hadn't updated the notebook, so it was using the wrong columns from the synthetic DataFrame to create the charts. The model was working fine; only the visualization was off. That's fixed in the notebook now.
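
For anyone adapting the plotting code, here is a minimal sketch of why a change in column order can silently break a chart. It is not the notebook's actual code: the attribute names, the sample rows, and the two helper functions are hypothetical, and it assumes the charts picked columns by position, which the thread does not confirm. It only illustrates that selecting by name stays stable when the column order changes.

```python
import pandas as pd

# Hypothetical attribute names and values, chosen only for illustration.
rows = [
    {"domain": "en", "access": "desktop", "agent": "spider"},
    {"domain": "fr", "access": "mobile", "agent": "all-agents"},
    {"domain": "en", "access": "desktop", "agent": "all-agents"},
]

# The same synthetic rows under two column layouts: an arbitrary order and
# the (hypothetical) training-data order.
layout_a = pd.DataFrame(rows, columns=["agent", "domain", "access"])
layout_b = pd.DataFrame(rows, columns=["domain", "access", "agent"])


def counts_by_position(df, position):
    """Fragile: the result depends on the column order of the DataFrame."""
    return df.iloc[:, position].value_counts().to_dict()


def counts_by_name(df, name):
    """Robust: selects the attribute explicitly, whatever the column order."""
    return df[name].value_counts().to_dict()


# Position 0 is "agent" in one layout and "domain" in the other, so a chart
# built from it would show a different attribute after the order changes.
positional_a = counts_by_position(layout_a, 0)
positional_b = counts_by_position(layout_b, 0)

# Selecting by name yields the same distribution for both layouts.
named_a = counts_by_name(layout_a, "domain")
named_b = counts_by_name(layout_b, "domain")

print(positional_a, positional_b)
print(named_a, named_b)
```

Plotting code that refers to columns by name would not have been affected by a change in the returned column order.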