Reproduce the Hyper-grid results in GFlowNet paper

mbi2gs / gflownet_tf2

Generative Flow Network demo in Tensorflow2

MIT License

Hi,

Thanks for sharing such beautiful code! It is really helpful for me in learning how to implement GFlowNet.

I have a couple of questions.

In your notebook, the learned policy reproduces the reward grid quite well, with an L2 as low as 0.2. However, when I simply clone the repository and run the cells, the result is not as close: the L2 is around 0.7. Below is the result I get. I wonder if there is anything I have missed.

I also went through the gfnAgent class. At line 330, I would expect the forward and backward probabilities to be log-transformed before being summed with z0. However, they are added directly. Can you help me understand that part?
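To make the question concrete, here is a minimal sketch of what I expected, assuming the loss is the trajectory balance objective and that z0 holds the log of the partition function. The function name and arguments are hypothetical and are not taken from the repository.

```python
import math

def trajectory_balance_loss(log_z, forward_probs, backward_probs, reward):
    """Squared trajectory-balance residual for a single trajectory.

    Hypothetical helper for illustration only; it assumes log_z is
    log(Z) and that the probabilities are given in linear space.
    """
    # Take the log of each step probability before summing.
    log_pf = sum(math.log(p) for p in forward_probs)
    log_pb = sum(math.log(p) for p in backward_probs)
    # Balance condition in log space:
    # log Z + sum(log P_F) = log R + sum(log P_B)
    residual = log_z + log_pf - math.log(reward) - log_pb
    return residual ** 2

# A trajectory that satisfies the balance condition exactly:
# Z * P_F = 2 * 0.5 * 0.5 = 0.5 and R * P_B = 0.5 * 1 * 1 = 0.5.
balanced = trajectory_balance_loss(math.log(2.0), [0.5, 0.5], [1.0, 1.0], 0.5)

# The same trajectory with a mismatched reward gives a positive loss.
unbalanced = trajectory_balance_loss(math.log(2.0), [0.5, 0.5], [1.0, 1.0], 1.0)
```

If the network already outputs log-probabilities, then adding them to z0 directly would be equivalent to this, which may be what the code at that line does.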

Thanks again for such great code!

Best, Cancan
