WilsonWangTHU / mbbl

Stochastic Value Gradients implementation #5

Open proceduralia opened 5 years ago

proceduralia commented 5 years ago

Hello,

thank you for your hard work in providing and polishing this repository. I would like to replicate the results using Stochastic Value Gradients. I see in the README that the implementation will be made public: can you estimate how long that will take?

Thank you,

Pierluca

0xangelo commented 4 years ago

Hi there,

I am also interested in reproducing the results for SVG(1). I have my own implementation of SVG(1), but some details about the implementation in the original paper are a bit obscure to me.

Specifically, it's not entirely clear to me how the KL regularization is performed and how the KL penalty is chosen and updated. I believe this plays a crucial role in the stability of the algorithm.
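
To make the question concrete, here is a minimal sketch of one well-known way to choose and update a KL penalty: the adaptive rule from the PPO paper (Schulman et al., 2017). It is only an assumption that SVG(1) uses anything similar; the function names, the `target_kl` parameter, and the form of the penalized loss are hypothetical and not taken from this repository or the SVG paper.

```python
def update_kl_coeff(beta, observed_kl, target_kl):
    """Adaptive KL coefficient update, as described for PPO's KL-penalty variant.

    Assumption: this rule is borrowed from PPO, not confirmed for SVG(1).
    """
    if observed_kl < target_kl / 1.5:
        # Policy moved less than intended: relax the penalty.
        return beta / 2.0
    if observed_kl > target_kl * 1.5:
        # Policy moved more than intended: tighten the penalty.
        return beta * 2.0
    # Within tolerance: keep the coefficient unchanged.
    return beta


def penalized_loss(policy_loss, kl, beta):
    """Policy loss plus a KL penalty term weighted by beta (hypothetical form)."""
    return policy_loss + beta * kl


if __name__ == "__main__":
    beta = 1.0
    # Hypothetical per-iteration KL measurements between old and new policy.
    for kl in [0.001, 0.05, 0.01]:
        beta = update_kl_coeff(beta, kl, target_kl=0.01)
        print(f"kl={kl}, beta={beta}")
```

Knowing whether the implementation here uses a fixed coefficient, a rule like this one, or something else entirely is the detail I am missing.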

Therefore, I would love to take a look at the implementation used here and would be extremely grateful if you made it public.

Best regards, Ângelo