mseitzer closed this issue 4 years ago
Skew-Fit modifies the data distribution used to train the VAE. Otherwise, the VAE is still a normal VAE, and so we sample goals by sampling from the VAE prior.
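To make that concrete, here is a minimal sketch of what sampling goals from the VAE prior amounts to: draw z ~ N(0, I) and decode it into an image goal. The ToyVAE class and function names below are illustrative placeholders, not the exact rlkit classes.

```python
import torch
import torch.nn as nn


class ToyVAE(nn.Module):
    """Stand-in for a trained VAE; only the pieces needed for goal sampling."""

    def __init__(self, representation_size=4, image_dim=16):
        super().__init__()
        self.representation_size = representation_size
        self.decoder = nn.Linear(representation_size, image_dim)

    def decode(self, latents):
        # Map latent codes to (flattened) image goals in [0, 1].
        return torch.sigmoid(self.decoder(latents))


def sample_goals_from_vae_prior(vae, batch_size):
    # Sample latent goals z ~ N(0, I) from the prior and decode them.
    latents = torch.randn(batch_size, vae.representation_size)
    image_goals = vae.decode(latents)
    return latents, image_goals


latent_goals, image_goals = sample_goals_from_vae_prior(ToyVAE(), batch_size=8)
```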
From the paper, my understanding was that, besides modifying the data distribution for training the VAE, an equally important part of Skew-Fit is performing goal-directed exploration using the learned goal distribution q_\phi^G. Hence my question.
But I guess one can assume that the VAE prior roughly gives you the maximum-entropy distribution for Pushing, since the goal space there is nicely bounded and axis-aligned?
Just for my understanding: is the use of the VAE prior for this experiment mentioned anywhere in the paper? From what I can tell, Table 3 states that q_\phi^G is used for sampling goals.
Thanks for clarifying.
Hi,
I am wondering why the goal sampling modes for both the exploration goals and the relabeling goals are set to use the VAE prior instead of the distribution learned by Skew-Fit (i.e., custom_goal_sampler) in the Sawyer Push experiment. Would that not make it almost identical to RIG?
https://github.com/vitchyr/rlkit/blob/20ea0820eb89bddae7c6a5171038a005e472c3d0/examples/skewfit/sawyer_push.py#L67-L69
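To be explicit about what I mean by the distribution learned by Skew-Fit: my reading of the paper is that goals are resampled from previously visited states, with weights proportional to the modelled density raised to a negative exponent alpha, so that rarely visited states are upweighted. A rough sketch of that idea (the function and variable names are mine, not rlkit's custom_goal_sampler):

```python
import numpy as np


def sample_skewfit_goals(replay_images, log_density_fn, batch_size, alpha=-1.0):
    # Sketch of q_phi^G: reweight stored states by p_theta(s) ** alpha
    # (alpha < 0 upweights rarely visited states) and resample from them.
    log_p = np.array([log_density_fn(img) for img in replay_images])
    log_w = alpha * log_p
    weights = np.exp(log_w - log_w.max())  # subtract max for numerical stability
    weights /= weights.sum()
    idx = np.random.choice(len(replay_images), size=batch_size, p=weights)
    return [replay_images[i] for i in idx]
```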
Thank you very much!