Load a custom logged data (without pscore) to train a BCQ (or others) model?

ericyue commented 9 months ago

Could you provide a more detailed Jupyter notebook showing how to load custom logged data (without pscore) to train a BCQ (or other) model? It would be very helpful!

aiueola commented 9 months ago

Hi @ericyue,

Thank you for reaching out. If you are only interested in learning a policy (i.e., you do not need off-policy evaluation (OPE)), you can transform the logged data as follows:

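from d3rlpy.dataset import MDPDataset

# Only state, action, reward, and done are used here; pscore is not needed.
offlinerl_dataset = MDPDataset(
    observations=train_logged_dataset["state"],
    actions=train_logged_dataset["action"],
    rewards=train_logged_dataset["reward"],
    terminals=train_logged_dataset["done"],
)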

This does not require "pscore", so it should work on your dataset.
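
If your custom logs are not yet in this column layout, the following is a minimal sketch of one way to build it. It assumes the logs are stored as per-episode lists of step records and that each episode ends at its last recorded step. The record layout and the helper name `build_logged_dataset` are hypothetical and are not part of scope-rl or d3rlpy; only the keys "state", "action", "reward", and "done" come from the snippet above.

```python
from typing import Any, Dict, List, Sequence


def build_logged_dataset(
    episodes: Sequence[Sequence[Dict[str, Any]]],
) -> Dict[str, List[Any]]:
    """Flatten per-episode step records into aligned per-step columns.

    Hypothetical helper: each step record is assumed to be a dict with
    "state", "action", and "reward" entries. No pscore is required.
    """
    data: Dict[str, List[Any]] = {
        "state": [],
        "action": [],
        "reward": [],
        "done": [],
    }
    for episode in episodes:
        if len(episode) == 0:
            raise ValueError("episodes must contain at least one step")
        last = len(episode) - 1
        for t, step in enumerate(episode):
            data["state"].append(list(step["state"]))
            data["action"].append(step["action"])
            data["reward"].append(float(step["reward"]))
            # Assumption: the last recorded step of an episode is terminal.
            data["done"].append(1.0 if t == last else 0.0)
    return data


# Toy logs: two episodes with 2-dimensional states and discrete actions.
episodes = [
    [
        {"state": [0.0, 1.0], "action": 0, "reward": 0.0},
        {"state": [0.5, 0.5], "action": 1, "reward": 0.0},
        {"state": [1.0, 0.0], "action": 1, "reward": 1.0},
    ],
    [
        {"state": [0.2, 0.8], "action": 1, "reward": 0.0},
        {"state": [0.9, 0.1], "action": 0, "reward": 1.0},
    ],
]

train_logged_dataset = build_logged_dataset(episodes)
```

Before passing the columns to MDPDataset, each one would likely need to be converted to a NumPy array (for example with numpy.asarray), since the snippet above indexes train_logged_dataset as a dict of arrays.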

ericyue commented 9 months ago

@aiueola Thanks for replying! I want to do OPE too, and I am hitting the same error described here. Could you help with this? https://github.com/hakuhodo-technologies/scope-rl/issues/25

hakuhodo-technologies / scope-rl

Load a custom logged data (without pscore) to train a BCQ (or others) model? #26