thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.
https://tianshou.org
MIT License

How to train an offline BCQ model with custom logged data? #1040


ericyue commented 5 months ago

I am working in the field of reinforcement learning research, particularly in medical applications.

My question is about using pre-collected offline data (state, action, next state, and reward) to construct a logged dataset. I noticed that the documentation mostly focuses on data collected online from simulated environments. However, my dataset is offline and pre-collected.

Could you provide guidance or share best practices on how to train a BCQ model with an offline dataset?

Trinkle23897 commented 5 months ago

You can create a buffer yourself: load the data into RAM, reformat it to be ReplayBuffer-compatible, and save it to disk.

This is a great example to start with: https://github.com/thu-ml/tianshou/blob/4756ee80ff11cd8692aef3752f35c0af60a452e8/examples/offline/convert_rl_unplugged_atari.py