kzl decision-transformer issues

kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

MIT License

2.33k stars, 440 forks

issues

about graph experiment

#28 ysymyth opened 2 years ago
0
batch sampling: only last tokens?

#27 Howuhh closed 2 years ago
1
Question: is it possible to use the same Decision Transformer for new training trajectories generation?

#26 danielgafni closed 2 years ago
2
State and Return preds input

#25 backpropper closed 2 years ago
1
understanding use of rewards

#24 jeweinb closed 2 years ago
2
Return-to-go conditioning on Atari

#23 geekyutao closed 3 years ago
2
Possible misalignment in calculating rtg in Atari

#22 geekyutao closed 3 years ago
1
Regarding atari breakout results

#21 dido1998 closed 3 years ago
1
aligning action embeddings to other embeddings at line 237

#20 loct824 closed 3 years ago
3
Timesteps Shape

#19 MrShininnnnn closed 3 years ago
1
Add medium-expert dataset

#18 ChenDRAG opened 3 years ago
0
difference between two GPT models used in this repo?

#17 ChenDRAG closed 3 years ago
1
how to get the score of an expert policy and some other details

#16 TianhongDai closed 3 years ago
1
Misalignment supervision when predicting successor state

#15 geekyutao closed 3 years ago
1
Application to multi-agent environment

#14 skull8888888 closed 3 years ago
1
RuntimeError

#13 ximinng opened 3 years ago
1
undefined name 'returns'

#12 l3str4nge closed 3 years ago
2
Just a little correction. A parameter changed in script.

#11 shogi880 closed 3 years ago
1
remove shadowing

#10 enosair closed 3 years ago
1
Getting killed before loading any data

#9 YashButala closed 3 years ago
1
After loading 50 trajectories, the terminal shows `killed`

#8 SunHaoOne closed 3 years ago
8
Citation request

#7 222464 closed 3 years ago
2
Is this a bug?

#6 yuanmao closed 3 years ago
1
state and action prediction

#5 mehdimashayekhi closed 3 years ago
4
Do both BC and DT fit the training data well?

#2 w-hc closed 3 years ago
1
You forgot to include wandb

#1 zitterbewegung closed 3 years ago
0