feat: Diff-SR for image-based POMDP

Hi Haotian,

This PR adds the code for Diff-SR, a new representation-based RL method that derives representations via score matching. Bo asked us to open-source the code and merge it into this codebase. The implementation is divided into two parts: one for MuJoCo tasks (which I believe @dmitryshribak will upload soon) and the other, in this PR, for MetaWorld tasks. Because the training procedure for MetaWorld tasks is somewhat different, I followed the practice of mulvrep and isolated the implementation in a separate directory. Please take a look and let me know whether we can merge this into the main branch.

haotiansun14 / rl-rep

feat: Diff-SR for image-based POMDP #3