ToruOwO / mimex

PyTorch implementation for all methods and environments in the paper "MIMEx: Intrinsic Rewards from Masked Input Modeling"

Which commands reproduce the "noise" baseline curve from the paper? #1

Closed by LiuXing1122 6 months ago

LiuXing1122 commented 6 months ago

Thank you for the excellent work. Which commands should I run in the mimex dmc and mimex pixmc projects to reproduce the "noise" baseline curve from the paper? I could not find any instructions for the noise method in the code. Thank you for your guidance.

ToruOwO commented 6 months ago

Thank you for your interest! The "noise" baseline simply adds random action noise, as in the original PPO implementation, so you can reproduce it by running without exploration (e.g. with the no_expl config file).
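For readers unfamiliar with the idea, here is a minimal sketch of a "noise only" baseline. It is not taken from the mimex code. It assumes that "random action noise" means sampling each action from a Gaussian policy, which is common in PPO for continuous control, and that running without exploration means the intrinsic reward term is dropped. The names `sample_action`, `total_reward`, and `intrinsic_coef` are hypothetical and chosen only for illustration.

```python
import random


def sample_action(mean, std, rng):
    """Sample each action dimension from a Gaussian N(mean_i, std_i^2).

    Assumption: this per-dimension Gaussian sampling stands in for the
    "random action noise" of a stochastic PPO policy.
    """
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def total_reward(extrinsic, intrinsic=0.0, intrinsic_coef=0.0):
    """Combine task reward with an optional exploration bonus.

    Hypothetical helper: with intrinsic_coef = 0.0 (no exploration bonus),
    the agent is trained on the extrinsic reward alone.
    """
    return extrinsic + intrinsic_coef * intrinsic


rng = random.Random(0)       # fixed seed so the sketch is repeatable
policy_mean = [0.0, 0.5]     # hypothetical policy output for a 2-D action
policy_std = [0.1, 0.1]      # hypothetical per-dimension noise scale

action = sample_action(policy_mean, policy_std, rng)
reward = total_reward(extrinsic=1.5, intrinsic=3.0)  # bonus ignored: coef is 0
```

Under these assumptions, the only source of exploration is the sampling noise in `sample_action`. The intrinsic term never affects the reward the agent is trained on.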