Hi, Dinghuai
Thanks for your great work.
I have a question about reproducing the result on the checkerboard dataset using `GFlowNet_Randf_TB`. The result I obtained was not meaningful. However, I can reproduce the successful result from the paper by using the learned backward policy P_B, `GFlowNet_LearnedPb_TB`.
I am a little confused about why a random backward policy does not work on the checkerboard dataset, since I think we can always find a forward policy that matches any given backward policy. Maybe I missed something that needs tuning.
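
To make the reasoning above concrete, here is a minimal sketch on a tiny hand-made DAG. It is a hypothetical toy example, not the checkerboard task and not code from the repository; the graph, the rewards, and the helper name `forward_policy_from_backward` are all assumptions made for illustration. With the backward policy fixed to uniform over parents, the flows are pushed from the terminal states back to the root, and the forward policy is read off from the edge flows.

```python
from collections import defaultdict

# Hypothetical toy DAG (illustration only): child -> list of its parents.
parents = {"a": ["s0"], "b": ["s0"], "x": ["a"], "y": ["a", "b"]}
reward = {"x": 1.0, "y": 3.0}      # rewards R at the terminal states x and y
order = ["x", "y", "a", "b"]       # every child is listed before its parents


def forward_policy_from_backward(parents, reward, order, root="s0"):
    """Return (P_F, Z) consistent with a fixed uniform backward policy."""
    state_flow = defaultdict(float, reward)    # F(x) = R(x) at terminal states
    edge_flow = {}
    for child in order:
        p_b = 1.0 / len(parents[child])        # uniform P_B(parent | child)
        for parent in parents[child]:
            # F(parent -> child) = F(child) * P_B(parent | child)
            edge_flow[(parent, child)] = state_flow[child] * p_b
            state_flow[parent] += edge_flow[(parent, child)]
    # P_F(child | parent) = F(parent -> child) / F(parent)
    p_f = {(s, c): f / state_flow[s] for (s, c), f in edge_flow.items()}
    return p_f, state_flow[root]


p_f, Z = forward_policy_from_backward(parents, reward, order)

# Probability of reaching each state when sampling from P_F,
# accumulated over edges listed with parents before children.
reach = defaultdict(float, {"s0": 1.0})
for s, c in [("s0", "a"), ("s0", "b"), ("a", "x"), ("a", "y"), ("b", "y")]:
    reach[c] += reach[s] * p_f[(s, c)]

print("Z =", Z)
print("P(x) =", reach["x"], "P(y) =", reach["y"])
```

In this sketch the terminal probabilities come out proportional to the rewards, which is the sense in which I expected a matching forward policy to exist. It only covers the exact tabular case, so it says nothing about how easy that forward policy is for a network to learn.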
Any comment would be quite helpful. Thanks in advance!
Best, Shanchao