when will top-k off-policy be implemented?

awarebayes / RecNN

Reinforced Recommendation toolkit built around pytorch 1.7

Apache License 2.0

574 stars, 113 forks

when will top-k off-policy be implemented? #3

Closed joseph-chan closed 4 years ago

joseph-chan commented 4 years ago

when will top-k off-policy be implemented? I was reading this paper and looking forward to its implementation released, haha

awarebayes commented 4 years ago

I am focusing on making the library usable and writing the docs right now. Once the package is in its "mature" state, I will be adding new algorithms. You know, changing one function and rewriting it in ~5 files has never been cool. I'd assume it would take a couple of weeks.

awarebayes commented 4 years ago

Reinforce TopK works on top of existing working recommenders (DDPG, TD3, SAC, BCQ) and I am still working on these. Gonna keep your issue open till it's released.

joseph-chan commented 4 years ago

It's definitely a great job, and many people will benefit from it. Thanks for your great effort; I will keep following your cool work.

tangbotony commented 4 years ago

Attention...

jdxyw commented 4 years ago

Looking forward to it.

awarebayes commented 4 years ago

Yes, I have already started writing an article, and it is scheduled for this week. Dunno about the implementation, but it is coming soon. I have already started working on it and got a rough sketch here: https://gist.github.com/awarebayes/9c8a0f28e83a723549ce4ed0c4731997. The article already covers most of the paper. Make sure to follow so you don't miss it: https://medium.com/@awarebayes. The code will be published here.

awarebayes commented 4 years ago

Just implemented the off-policy correction, without TopK yet. But TopK is coming... Link to the notebook

joseph-chan commented 4 years ago

Great job! I have been following your recent work closely, and this was a really pleasant surprise.

awarebayes commented 4 years ago

TL;DR: will keep it open until the article is published.

Just released Reinforce TopK! Here is the private article draft (not released yet): link. I was gonna send it to the Google folks to take a quick look at, so it is somewhat official. Also, I set up all of the notebooks online.

Anyways, here is a quick look at the policy loss with TopK correction: (screenshot)

Without any correction: (screenshot)

Notebook in the examples
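For readers without the notebook at hand, here is a minimal sketch of the per-sample weight that the top-K off-policy correction puts in front of the REINFORCE term, as described in the paper this issue refers to. It is not RecNN's code: the function names, the `cap` argument, and the plain-Python style are assumptions made for illustration. The weight is the importance ratio pi(a|s) / beta(a|s) times the top-K multiplier K * (1 - pi(a|s))^(K - 1). With K = 1 the multiplier equals 1, so the same function also covers plain off-policy correction without TopK.

```python
import math


def topk_corrected_weight(pi_prob, beta_prob, k=1, cap=None):
    """Weight applied to reward * grad log pi(a|s) for one logged action.

    pi_prob   -- probability of the action under the policy being trained
    beta_prob -- probability of the same action under the logging policy
    k         -- size of the recommended slate (k=1 disables the top-K term)
    cap       -- optional upper bound on the importance ratio (hypothetical knob)
    """
    importance = pi_prob / beta_prob
    if cap is not None:
        importance = min(importance, cap)
    # Top-K multiplier: K * (1 - pi)^(K - 1); equals 1 when k == 1.
    lambda_k = k * (1.0 - pi_prob) ** (k - 1)
    return importance * lambda_k


def surrogate_policy_loss(pi_probs, beta_probs, rewards, k=1, cap=None):
    """Mean of -weight * reward * log pi over a batch of logged actions.

    In an autodiff framework the weight would be treated as a constant
    (detached), so that gradients flow only through log pi.
    """
    total = 0.0
    for pi_p, beta_p, reward in zip(pi_probs, beta_probs, rewards):
        weight = topk_corrected_weight(pi_p, beta_p, k=k, cap=cap)
        total += -weight * reward * math.log(pi_p)
    return total / len(pi_probs)
```

One consequence of the multiplier: as pi(a|s) approaches 1 it shrinks toward 0 for K > 1, so actions the policy already ranks confidently contribute less to the update than they would without the top-K term.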

joseph-chan commented 4 years ago

The code and detailed documentation were already given in awarebayes's previous reply, so I closed the issue.