MishaLaskin / curl

CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
MIT License

Trouble reproducing results. Help much appreciated~ #27

Open sunchipsster1 opened 1 year ago

sunchipsster1 commented 1 year ago

Hello! Thank you so much for putting up this valuable resource! I have been unable to replicate the results and was wondering if I may ask for some advice.

I have been comparing CURL (using the default settings and command listed at https://github.com/MishaLaskin/curl) against CURL with the following lines commented out, which should reduce it to pixel SAC:

    # if step % self.cpc_update_freq == 0 and self.encoder_type == 'pixel':
    #     obs_anchor, obs_pos = cpc_kwargs["obs_anchor"], cpc_kwargs["obs_pos"]
    #     self.update_cpc(obs_anchor, obs_pos,cpc_kwargs, L, step,0)
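
An alternative to commenting lines out would be to gate the contrastive update behind a flag, so that both variants run from the same code and the ablation can be checked by counting calls. The following is only a minimal sketch of that idea: `ToyAgent`, `use_cpc`, and `count_cpc_updates` are hypothetical names for illustration and not part of the repository, and `update_cpc` here is a stub that only counts how often it is invoked.

```python
class ToyAgent:
    """Hypothetical stand-in for the agent; not the repository's class."""

    def __init__(self, encoder_type="pixel", cpc_update_freq=1, use_cpc=True):
        self.encoder_type = encoder_type
        self.cpc_update_freq = cpc_update_freq
        self.use_cpc = use_cpc  # hypothetical flag: False should give pixel SAC
        self.cpc_updates = 0    # call counter, used to verify the ablation

    def update_cpc(self, obs_anchor, obs_pos):
        # Stub for the contrastive update; it only records that it was called.
        self.cpc_updates += 1

    def update(self, step, cpc_kwargs):
        # The critic and actor updates would run here in a real agent.
        if (self.use_cpc
                and step % self.cpc_update_freq == 0
                and self.encoder_type == "pixel"):
            self.update_cpc(cpc_kwargs["obs_anchor"], cpc_kwargs["obs_pos"])


def count_cpc_updates(use_cpc, steps=10):
    """Run a few dummy update steps and report how many contrastive updates ran."""
    agent = ToyAgent(use_cpc=use_cpc)
    for step in range(steps):
        agent.update(step, {"obs_anchor": None, "obs_pos": None})
    return agent.cpc_updates
```

With a counter like this, a run with the flag off should report zero contrastive updates, which would at least confirm that the baseline really skips that step.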

For [cartpole, swingup], I obtained ~850 for CURL but, strangely, I also obtained ~850 (and very quickly too) for pixel SAC. This lack of difference was consistent across 5 seeds. Is my code change correct, or have I modified the code in the wrong way?

For the task [finger, spin], I obtained ~350 for both CURL and pixel SAC, again with no difference.

Thank you in advance for the kind help! :)

LiuZhenxian123 commented 3 months ago

Hi! Have you solved this problem? I'm having a similar problem.