caixunshiren / Highway-Decision-Transformer

Decision Transformer for offline single-agent autonomous highway driving
Apache License 2.0

How to set target return in training? #5

Open Fornerio opened 2 months ago

Fornerio commented 2 months ago

Hello,

I am trying to set a target return when training my DT, but the code raises several errors. Adding elements to the target_envs list leads to errors in evaluate_episode_rtg, because the key 'env' is not defined anywhere. Passing target-return information into the training pipeline in other ways, such as

    for iteration in range(last_iter+1, config['max_iters']):
        outputs = trainer.train_iteration(num_steps=config['num_steps_per_iter'], iter_num=iteration+1, print_logs=True)
        eval_rets = [outputs[f'evaluation/target{target_rew}_return_mean'] for target_rew in config['env_targets']]
        mean_ret = np.mean(eval_rets)

raises a KeyError, since train_iteration does not produce any keys of that kind.
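To show what I mean, here is a minimal sketch of a defensive lookup. It assumes that the value returned by train_iteration behaves like a plain dict and that the key pattern matches the one in my snippet; the helper name mean_target_return is hypothetical and not part of the repository. It avoids the exception, but it also confirms that no evaluation entries are present:

```python
from statistics import mean


def mean_target_return(outputs, env_targets):
    """Average the per-target evaluation returns that are present in `outputs`.

    Hypothetical helper: `outputs` is assumed to be a plain dict, and the key
    pattern is copied from the snippet above. Returns None when no matching
    key exists, instead of raising KeyError.
    """
    keys = [f"evaluation/target{t}_return_mean" for t in env_targets]
    found = [outputs[k] for k in keys if k in outputs]
    return mean(found) if found else None


# No evaluation keys were logged, so the helper reports None.
print(mean_target_return({"training/train_loss_mean": 0.1}, [10, 20]))
```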

So my question is: how does training work if no target return is provided?

Thanks
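For context on the question: in the standard Decision Transformer setup, the return-to-go fed to the model during training is computed from the rewards of each logged trajectory, so a target return only needs to be chosen at evaluation time. Whether this repository follows exactly that scheme is an assumption. A minimal sketch of the computation, with a hypothetical helper name returns_to_go and an undiscounted default:

```python
def returns_to_go(rewards, gamma=1.0):
    """Return, for each timestep t, the (discounted) sum of rewards from t onward.

    Hypothetical helper for illustration; not taken from the repository.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry reuses the sum already accumulated after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


print(returns_to_go([1.0, 2.0, 3.0]))  # [6.0, 5.0, 3.0]
```

Under this scheme, each training sample pairs a state and action with the return-to-go observed in the data, which is why no user-supplied target would be required until the model is rolled out.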