Question about learning GFN on the checkerboard dataset

Hi Dinghuai,

Thanks for your great work.

I have a question about reproducing the results on the checkerboard dataset with GFlowNet_Randf_TB. The results I obtained with it were not meaningful. However, I can reproduce the results from the paper when I use the learned backward policy PB (GFlowNet_LearnedPb_TB).

I am a little confused about why a random backward policy does not work on the checkerboard dataset, since I think that for any fixed backward policy we can always find a corresponding forward policy. Maybe I missed something that needs tuning.
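To make this reasoning concrete, here is a minimal toy sketch. It is not the code from this repository: the reward, the dimension `D`, and all helper names are made up for illustration. It fixes a uniform backward policy on a tiny binary space, derives the matching forward policy from the induced flows, and checks that sampling with it gives terminal probabilities proportional to the reward.

```python
from itertools import product

D = 3  # toy dimension (assumption, far smaller than the real dataset)
# Hypothetical positive reward over binary vectors of length D.
reward = {x: 1.0 + sum(x) for x in product((0, 1), repeat=D)}

def children(s):
    # A state has entries in {None, 0, 1}; a step fills one empty position.
    return [s[:i] + (v,) + s[i + 1:]
            for i in range(D) if s[i] is None for v in (0, 1)]

def p_b(parent, child):
    # Fixed uniform backward policy: unset one filled position at random.
    return 1.0 / sum(v is not None for v in child)

flow = {}

def state_flow(s):
    # F(s) = R(s) at terminal states, else the sum of outgoing edge flows,
    # where the edge flow is F(s -> s') = F(s') * P_B(s | s').
    if s not in flow:
        kids = children(s)
        flow[s] = reward[s] if not kids else sum(
            state_flow(c) * p_b(s, c) for c in kids)
    return flow[s]

def p_f(s, child):
    # Forward policy implied by the fixed backward policy.
    return state_flow(child) * p_b(s, child) / state_flow(s)

s0 = (None,) * D
Z = state_flow(s0)  # partition function implied by the flows

def terminal_probs(s, prob, out):
    # Sum forward trajectory probabilities into each terminal state.
    kids = children(s)
    if not kids:
        out[s] = out.get(s, 0.0) + prob
    for c in kids:
        terminal_probs(c, prob * p_f(s, c), out)
    return out

probs = terminal_probs(s0, 1.0, {})
for x in sorted(reward):
    print(x, round(probs[x], 4), round(reward[x] / Z, 4))
```

In this toy case the two printed columns agree for every terminal state, so a matching forward policy does exist for the fixed backward policy. This only shows existence on a small enumerable space; it says nothing about how hard that forward policy is to learn with a neural network.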

Any comments would be very helpful. Thanks in advance!

Best, Shanchao

GFNOrg / EB_GFN
