mbi2gs / gflownet_tf2

Generative Flow Network demo in Tensorflow2
MIT License

question regarding 'back_sample_trajectory' function #2

Closed kzk2000 closed 2 years ago

kzk2000 commented 2 years ago

Although the function is called 'back_sample_trajectory', it doesn't actually sample; it always greedily picks the action with the highest probability. See:

https://github.com/mbi2gs/gflownet_tf2/blob/main/gfn.py#L205

I'm asking mainly to understand GFlowNet implementations better: shouldn't there be an action sampler similar to the one in the forward pass here: https://github.com/mbi2gs/gflownet_tf2/blob/aece0a5463dc0df4d1773bebdb136efbb35fe317/gfn.py#L152 ?
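To make the distinction in this question concrete, here is a minimal sketch of the two ways of picking an action from a policy's output probabilities. It is not the repository's code: `greedy_action`, `sampled_action`, and the `probs` values are hypothetical names and numbers chosen for illustration, and plain Python stands in for TensorFlow.

```python
import random

def greedy_action(action_probs):
    """Return the index of the highest-probability action (argmax)."""
    return max(range(len(action_probs)), key=lambda i: action_probs[i])

def sampled_action(action_probs, rng):
    """Draw an action index at random, weighted by its probability."""
    return rng.choices(range(len(action_probs)), weights=action_probs, k=1)[0]

# Hypothetical policy output for a single state with three actions.
probs = [0.2, 0.7, 0.1]
rng = random.Random(0)  # seeded only so the sketch is repeatable

# The greedy choice is the same on every call for fixed probabilities.
greedy = [greedy_action(probs) for _ in range(1000)]

# The sampled choice varies from call to call, roughly following `probs`.
sampled = [sampled_action(probs, rng) for _ in range(1000)]
```

With fixed probabilities, `greedy` contains only index 1, while `sampled` contains a mix of indices. That is the difference between an argmax step and the kind of action sampler the question refers to.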

kzk2000 commented 2 years ago

I think I found my own answer: during training, there's no "actual" sampling, since you simply use the forward/backward pass on the "sampled" trajectories. So 'back_sample_trajectory' is really just 'backward_pass_trajectory'.

mbi2gs commented 2 years ago

Yeah, and when I chose the word "sample", it was in the context of a backward policy that is being actively learned and may change on the next training batch. So, in some sense, we're "sampling" the most likely trajectory from the current policy.

But you got the idea. Very close reading! Thanks for reaching out.
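The greedy backward pass discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in gfn.py: it assumes a 2D grid whose only backward moves are "decrement x" and "decrement y", and `toy_backward_policy`, `other_backward_policy`, and `greedy_backward_trajectory` are hypothetical names standing in for a learned policy and its rollout.

```python
def toy_backward_policy(state):
    """Hypothetical stand-in for a learned backward policy.

    Returns probabilities for the two backward actions
    (0: decrement x, 1: decrement y). A move that would leave
    the grid gets probability 0.
    """
    x, y = state
    if x == 0 and y == 0:
        raise ValueError("the origin has no parent state")
    total = x + y
    return [x / total, y / total]

def other_backward_policy(state):
    """A second hypothetical policy, as if the weights had been updated."""
    x, y = state
    return [0.0, 1.0] if y > 0 else [1.0, 0.0]

def greedy_backward_trajectory(terminal_state, backward_policy):
    """Walk from a terminal state back to the origin, always taking
    the backward action with the highest probability."""
    state = tuple(terminal_state)
    trajectory = [state]
    while state != (0, 0):
        probs = backward_policy(state)
        action = max(range(len(probs)), key=lambda i: probs[i])
        x, y = state
        state = (x - 1, y) if action == 0 else (x, y - 1)
        trajectory.append(state)
    return trajectory

traj_a = greedy_backward_trajectory((2, 1), toy_backward_policy)
traj_b = greedy_backward_trajectory((2, 1), other_backward_policy)
```

For a fixed policy the rollout is deterministic, so calling it twice on the same terminal state returns the same path. Swapping in a different policy, as would happen after a training update, can yield a different path from the same terminal state, which is the sense of "sampling" from the current policy described above.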