[Question] Reward net transfer

Hello,

I want to run IRL on a task for which I have some expert demonstrations. The demonstrations are a bit old, and the action space has grown since they were recorded: the first version of the task had only 5 actions, whereas the new version adds 3 more. Is it possible to train a reward net on the existing expert demonstrations (e.g. using AIRL) and then use the trained reward net to train a new policy that can also take the added actions? If so, I'm not sure what this would look like when creating a RewardNet class.
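To make the question concrete, here is a minimal sketch of one option I can think of: a reward that depends only on the state and next state, so the number of actions never enters the network. This is plain PyTorch, not a confirmed imitation API. The class name `StateOnlyReward`, the observation size and the hidden width are all hypothetical. The `forward(state, action, next_state, done)` argument order is meant to mirror the one a RewardNet subclass implements, as far as I understand it.

```python
import torch
import torch.nn as nn

OBS_DIM = 4        # hypothetical observation size
N_OLD_ACTIONS = 5  # actions available when the demonstrations were recorded
N_NEW_ACTIONS = 8  # actions available in the current version of the task


class StateOnlyReward(nn.Module):
    """Reward r(s, s') that never reads the action (hypothetical sketch)."""

    def __init__(self, obs_dim: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action, next_state, done):
        # `action` and `done` are accepted only to keep the four-argument
        # signature; they are deliberately unused.
        x = torch.cat([state, next_state], dim=-1)
        return self.mlp(x).squeeze(-1)


torch.manual_seed(0)
reward_net = StateOnlyReward(OBS_DIM)

batch = 6
state = torch.randn(batch, OBS_DIM)
next_state = torch.randn(batch, OBS_DIM)
done = torch.zeros(batch)

# Action indices drawn from the old range and from the newly added range.
old_actions = torch.randint(0, N_OLD_ACTIONS, (batch,))
new_actions = torch.randint(N_OLD_ACTIONS, N_NEW_ACTIONS, (batch,))

r_old = reward_net(state, old_actions, next_state, done)
r_new = reward_net(state, new_actions, next_state, done)
```

Since the action is ignored, the same transition gets the same reward regardless of which action index produced it, so the 5-action demonstrations and the 8-action policy could share one network. The obvious limitation is that such a reward cannot express a preference between actions that lead to the same next state. If I read the code correctly, `BasicRewardNet` has a `use_action` flag that might give the same effect, but I have not verified this.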

I would appreciate some help.

Thanks in advance.

HumanCompatibleAI / imitation

[Question] Reward net transfer #838