Backward trajectory time steps in `intro_gfn_continuous_line.ipynb`

tsa87 commented 4 months ago

Thank you for this great library. In the tutorial notebook, for computing PB of trajectory, we only consider timestep range [trajectory_length, 2). Why do we stop at time step 2?

The code work and the performance seems to improve if we consider the timestep range [trajectory_length, 1). (Skipping modelling backward step of time 1 to time 0)

# Backward loop to compute logPB from existing trajectory under the backward policy.
for t in range(trajectory_length, 2, -1):
    policy_dist = get_policy_dist(backward_model, trajectory[:, t, :])
    action = trajectory[:, t, 0] - trajectory[:, t - 1, 0]
    logPB += policy_dist.log_prob(action)

josephdviviano commented 2 months ago

Thanks for raising this and sorry for the delay. I am looking into it now.

josephdviviano commented 2 months ago

OK. thanks. You identified a strange holdover bug from a long time ago. I've updated both notebooks (other than the incomplete one) with the solutions from the gflownet workshop. I think they are both easier to read, and for sure bug free. Sorry for the delay getting back to you.

Addressed here: https://github.com/GFNOrg/torchgfn/pull/187

josephdviviano commented 2 months ago

This is merged into main - please reopen if you continue to have issues.

GFNOrg / torchgfn

Backward trajectory time steps in `intro_gfn_continuous_line.ipynb` #182