keiohta / tf2rl

TensorFlow2 Reinforcement Learning
MIT License
461 stars 104 forks source link

problem with load_trajectories function in experiments/utils.py #145

Closed naji-s closed 2 years ago

naji-s commented 2 years ago

I don't understand the logic of having max_step-1 in line 51 of experiments/utils.py in the body of the function load_trajectories: https://github.com/keiohta/tf2rl/blob/82d9eecda78e22021efa0821bf02429ac7827f4d/tf2rl/experiments/utils.py#L40

the line I am referring to is: https://github.com/keiohta/tf2rl/blob/82d9eecda78e22021efa0821bf02429ac7827f4d/tf2rl/experiments/utils.py#L51

I think having actions[:max_steps-1] makes the actions one element shorter than the states one, whereas there is no need for that after you did an adjustment in lines 47 through 48.

ymd-h commented 2 years ago

@naji-s Thank you for reporting.

For better discussion, could you open an issue based on your problem from the next time?

At first glance, I feel this issue is closely related with #144.

We would like to know the true problem you encountered.

If you give us minimal reproducible example code, we can reduce investigation and we will provide fixed version sooner.

(FYI: If your motivation is "question", please use disscussions. We want to use this issue for development bug tracker.)

naji-s commented 2 years ago

I thought the logic of what I pointed out, namely a function argument that is never used in the body of the function is enough to report an issue. But here is the minimal example. Suppose you already ran DDPG on pendulum-v0. The following should return results of length 300:

expert_path_dir = 'results/20210804T173752.061185_SAC_'
from tf2rl.experiments.utils import restore_latest_n_traj
expert_trajs = restore_latest_n_traj(
    expert_path_dir, n_path=20, max_steps=300)

Instead it will return a dictionary with arrays of length 1000, the original length of the trajectories (because max_step is never used in restore_latest_n_traj).

Now here is when it gets connected to #144. But that's for itself another issue with a different function and line.

ymd-h commented 2 years ago

@naji-s Please, do not close until the problem is resolved. My PR is still being reviewed.

Moreover, we would like you to check whether your problem is fixed or not after newer version will be released.