ikostrikov / implicit_q_learning

MIT License

Code for Behavior cloning policy #11

Open return-sleep opened 11 months ago

return-sleep commented 11 months ago

Could you please provide the code for the behavior cloning (BC) baseline reported in Table 1 of the paper? It appears to perform very well on the walker2d-medium-expert-v2 dataset, better than the BC results reported in other papers, e.g., Decision Transformer. A short description of the implementation would also be very useful to me. Thank you very much for your outstanding work.