Binbose closed this issue 1 year ago
Thank you for raising this issue, and apologies for the very late answer; I totally missed it. The problem came from how the parameters of a parametrization were defined: when PF and PB did not share parameters, the parameters of PF were not being optimized. This should be fixed in https://github.com/saleml/gfn/pull/23.
Please let me know if this solves your problem (it should).
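To illustrate the kind of bug described above, here is a minimal sketch of collecting the parameters of PF and PB for an optimizer so that shared parameters are counted once and unshared ones are never dropped. This is not the library's actual code or the content of the linked PR: the helper `collect_unique_params` and the stand-in parameter lists are hypothetical, and plain Python objects are used in place of real tensors.

```python
from typing import Iterable, List


def collect_unique_params(*param_groups: Iterable[object]) -> List[object]:
    """Merge several parameter collections, keeping each object once (by identity)."""
    seen = set()
    unique = []
    for group in param_groups:
        for p in group:
            # Compare by identity: a shared parameter is the very same object in both groups.
            if id(p) not in seen:
                seen.add(id(p))
                unique.append(p)
    return unique


# Hypothetical stand-ins for parameter tensors
# (with PyTorch these would typically come from module.parameters()).
pf_torso = [object(), object()]
pf_head = [object()]
pb_head = [object()]

# Case 1: PB reuses PF's torso, so the torso parameters appear in both groups.
shared = collect_unique_params(pf_torso + pf_head, pf_torso + pb_head)

# Case 2: PB has its own torso, so nothing overlaps.
# PF's parameters must still end up in the list handed to the optimizer.
pb_torso = [object(), object()]
separate = collect_unique_params(pf_torso + pf_head, pb_torso + pb_head)
```

In the shared case the merged list holds the torso once plus both heads; in the separate case it holds every parameter of both networks, which is the condition that has to hold for PF to be trained when `torso=None` is passed for PB.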
Hey,
I was running the example code to solve for the 2-dimensional Hypergrid, and I noticed that if I don't share parameters between the forward and backward model:
logit_PB = LogitPBEstimator(env=env, module_name='NeuralNet', torso=None)
the agent doesn't learn anything. Is this expected?