hakuhodo-technologies / scope-rl

SCOPE-RL: A Python library for offline reinforcement learning, off-policy evaluation, and selection
https://scope-rl.readthedocs.io/en/latest/
Apache License 2.0

Load custom logged data (without pscore) to train a BCQ (or other) model? #26

Open ericyue opened 5 months ago

ericyue commented 5 months ago

Could you provide a more detailed Jupyter notebook showing how to load custom logged data (without pscore) to train a BCQ (or other) model? It would be very helpful!

aiueola commented 4 months ago

Hi @ericyue,

Thank you for reaching out. If you only want to learn a policy (i.e., you do not need off-policy evaluation (OPE)), you can convert the logged data as follows:

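from d3rlpy.dataset import MDPDataset

# Only these four keys of the logged data are needed; "pscore" is not used.
offlinerl_dataset = MDPDataset(
    observations=train_logged_dataset["state"],
    actions=train_logged_dataset["action"],
    rewards=train_logged_dataset["reward"],
    terminals=train_logged_dataset["done"],
)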

(See also: https://scope-rl.readthedocs.io/en/latest/documentation/quickstart.html)

This does not require "pscore", so it should work on your dataset.
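
If your custom logs are stored per episode rather than as flat arrays, one way to get the four keys used above might look like the sketch below. The per-step tuple layout `(state, action, reward)` and the helper names `flatten_episodes` and `to_mdp_dataset` are hypothetical, chosen only for illustration; the sketch also assumes that the last step of each episode should be marked as done and that `d3rlpy` is installed when `to_mdp_dataset` is called.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical log format: each episode is a list of (state, action, reward) steps.
Step = Tuple[Sequence[float], int, float]


def flatten_episodes(episodes: List[List[Step]]) -> Dict[str, list]:
    """Flatten per-episode steps into the flat "state"/"action"/"reward"/"done" columns."""
    logged: Dict[str, list] = {"state": [], "action": [], "reward": [], "done": []}
    for episode in episodes:
        for t, (state, action, reward) in enumerate(episode):
            logged["state"].append(list(state))
            logged["action"].append(action)
            logged["reward"].append(float(reward))
            # Assumption: the final step of every episode terminates it.
            logged["done"].append(1.0 if t == len(episode) - 1 else 0.0)
    return logged


def to_mdp_dataset(logged: Dict[str, list]):
    """Wrap the flat columns in an MDPDataset, as in the snippet above."""
    # Imported lazily so the flattening step runs even without d3rlpy installed.
    import numpy as np
    from d3rlpy.dataset import MDPDataset

    return MDPDataset(
        observations=np.asarray(logged["state"], dtype=np.float32),
        actions=np.asarray(logged["action"]),
        rewards=np.asarray(logged["reward"], dtype=np.float32),
        terminals=np.asarray(logged["done"], dtype=np.float32),
    )


# Toy data: two short episodes with 2-dimensional states and discrete actions.
episodes = [
    [((0.0, 1.0), 0, 0.0), ((0.5, 0.5), 1, 1.0)],
    [((1.0, 0.0), 1, 0.5)],
]
train_logged_dataset = flatten_episodes(episodes)
```

Calling `to_mdp_dataset(train_logged_dataset)` should then give a dataset you can pass to the offline RL algorithm; no "pscore" column is involved at any point.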

ericyue commented 4 months ago

@aiueola Thanks for replying! I want to do OPE too, and I hit the same error as in this issue. Could you help with it? https://github.com/hakuhodo-technologies/scope-rl/issues/25