Thank you for compiling this list of papers. I'm interested in GFlowNets, and this is definitely a great resource.
I have a few questions about GFlowNets, and I would appreciate it if you could share your thoughts.
I understand that the goal of a GFlowNet is to build a sampler that draws samples with probability proportional to a given reward function. However, if we raise the reward function to a high power, the sampling problem turns into finding the best configuration. So can we use a GFlowNet to solve "best configuration" problems? Is there any disadvantage to doing so (compared with, say, RL algorithms)?
So far, most of the GFlowNet work I have read solves discrete-space problems. I read in GFlowNet Foundations that the whole formulation can be easily adapted to the continuous case, but I'm not sure how to do it. Say I have a problem with a continuous state/action space. I think we can still use the trajectory balance loss. Instead of a forward transition probability, we would have a forward transition density function. For the backward transition probability, we could use a uniform density. And we would still need to learn the normalization term Z. Is this right? Or are there other considerations when dealing with the continuous case?
Yes, we can use a GFlowNet (or any similar method, such as variational inference) to solve the low-temperature sampling problem, which is effectively an optimization problem. However, if we only want to find a single optimum, I think RL or black-box optimization methods (such as Bayesian optimization) would be better, as they are designed for that purpose.
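To make the low-temperature point concrete, here is a minimal sketch in plain Python. The reward values and the helper name `tempered_distribution` are made up for illustration; the sketch only normalizes R(x)^beta over a small finite set, which is the target distribution a sampler would aim for, not a GFlowNet itself.

```python
def tempered_distribution(rewards, beta):
    """Return probabilities proportional to R(x)**beta over a finite set."""
    powered = [r ** beta for r in rewards]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical rewards for three configurations (illustration only).
rewards = [1.0, 2.0, 4.0]

p_beta_1 = tempered_distribution(rewards, 1)  # plain reward-proportional target
p_beta_8 = tempered_distribution(rewards, 8)  # reward raised to a high power

print(p_beta_1)
print(p_beta_8)
```

With these numbers, the mass on the best configuration grows from 4/7 at beta = 1 to more than 0.99 at beta = 8, so sampling from the tempered target behaves almost like picking the argmax.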
I agree with what you said. As far as I know, researchers have already made some early attempts at continuous GFlowNets; one example is the last two pages of the appendix of this work.
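The continuous setup described in the question could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method from any particular paper: the 1-D trajectory, the Gaussian forward policy with unit standard deviation, the uniform backward density on an interval of width 4, and the values of `log_z` and `log_reward` are all hypothetical. The only change from the discrete trajectory balance loss is that log-probabilities become log-densities.

```python
import math


def gaussian_log_density(x, mean, std):
    """Log-density of a 1-D Gaussian, used here as a stand-in forward policy."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def trajectory_balance_loss(log_z, log_pf_steps, log_pb_steps, log_reward):
    """Squared trajectory balance residual for one trajectory.

    log_pf_steps / log_pb_steps hold per-step forward / backward
    log-densities (log-probabilities in the discrete case).
    """
    residual = log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)
    return residual ** 2


# Hypothetical 1-D trajectory s0 -> s1 -> s2 (illustration only).
states = [0.0, 0.5, 1.2]

# Assumed forward policy: Gaussian centred on the current state, std 1.
log_pf = [gaussian_log_density(nxt, cur, 1.0)
          for cur, nxt in zip(states[:-1], states[1:])]

# Assumed backward policy: uniform density on an interval of width 4.
log_pb = [-math.log(4.0) for _ in log_pf]

log_z = 0.0               # learnable scalar in practice; fixed here
log_reward = math.log(2)  # hypothetical reward R(x) = 2 at the final state

loss = trajectory_balance_loss(log_z, log_pf, log_pb, log_reward)
print(loss)
```

In a real implementation, `log_z` and the forward policy parameters would be trained by gradient descent on this loss. A uniform backward density also needs a well-defined support for every state, which is one of the extra considerations in the continuous case.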