thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.
https://tianshou.org
MIT License

How to train an offline BCQ model with custom logged data? #1040


ericyue commented 5 months ago

I am working in the field of reinforcement learning research, particularly in medical applications.

My question is about using pre-collected offline data (state, action, next state, and reward) to construct a logged dataset. I noticed that the documentation mostly focuses on data collected online from simulated environments. However, my dataset is offline and pre-collected.

Could you provide guidance or share best practices on how to train a BCQ model with an offline dataset?

Trinkle23897 commented 5 months ago

You can create a buffer yourself: load the data into RAM, reformat it to be ReplayBuffer-compatible, and save it to disk.

This is a great example to start with: https://github.com/thu-ml/tianshou/blob/4756ee80ff11cd8692aef3752f35c0af60a452e8/examples/offline/convert_rl_unplugged_atari.py