-
I need that algorithm implemented here!!!
-
# Learning to play Yahtzee with Advantage Actor-Critic (A2C) | dionhaefner.github.io
My in-laws are really into the dice game Yatzy (the Scandinavian version of Yahtzee). If you’re unfamiliar with th…
-
Hi, I am new to tianshou and RL. I created an env and trained on it with PPO in tianshou, but I found that the sampled actions are out of range. I searched and found map_action, but it seems not to be used in tr…
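For context, one common way to keep sampled continuous actions inside an environment's bounds is to squash the raw policy output into [-1, 1] and then rescale it to [low, high]. The sketch below only illustrates that idea in plain NumPy; `map_action_sketch` and its parameters are hypothetical names, not tianshou's API, and it assumes a box-shaped action space with finite bounds.

```python
import numpy as np


def map_action_sketch(raw_action, low, high, bound_method="clip"):
    """Map an unbounded policy output into [low, high].

    Hypothetical helper for illustration; not part of any library.
    """
    raw = np.asarray(raw_action, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)

    # Step 1: bound the raw output to [-1, 1].
    if bound_method == "clip":
        bounded = np.clip(raw, -1.0, 1.0)
    elif bound_method == "tanh":
        bounded = np.tanh(raw)
    else:
        raise ValueError(f"unknown bound_method: {bound_method!r}")

    # Step 2: rescale [-1, 1] linearly onto [low, high].
    return low + (bounded + 1.0) * 0.5 * (high - low)


if __name__ == "__main__":
    # Raw samples outside [-1, 1] end up on the bounds of the action space.
    print(map_action_sketch([3.0, -3.0, 0.0], low=[-2.0] * 3, high=[2.0] * 3))
```

If the mapping is applied only when stepping the environment, the values stored for training can still be the raw, unmapped outputs, so seeing out-of-range numbers there does not by itself mean the environment received them.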
-
This could be entirely due to my setup and the mods I made to get it running (I am also posting in case anyone else runs into it), but the initial losses are NaN because the tensors are empty. During training t…
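As a general illustration of this failure mode: the mean of an empty array is NaN, so a loss averaged over an empty batch comes out as NaN. A minimal sketch of a guard, assuming the loss is a plain average of per-sample losses (`safe_mean_loss` is a hypothetical helper, not taken from the project in question):

```python
import numpy as np


def safe_mean_loss(per_sample_losses):
    """Average per-sample losses, returning None for an empty batch.

    Hypothetical helper for illustration only.
    """
    losses = np.asarray(per_sample_losses, dtype=np.float64)
    if losses.size == 0:
        # Nothing to average yet; let the caller skip this update
        # instead of propagating NaN into the optimizer.
        return None
    return float(losses.mean())


if __name__ == "__main__":
    for batch in ([], [1.0, 3.0]):
        loss = safe_mean_loss(batch)
        if loss is None:
            print("empty batch, skipping update")
        else:
            print(f"loss = {loss}")
```

Whether skipping the update is the right response depends on why the batch is empty; if a buffer simply has not filled yet, waiting is usually harmless.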
-
Comments for https://www.endpointdev.com/blog/2018/08/self-driving-toy-car-using-the-a3c-algorithm/
By Kamil Ciemniewski
-
Here is my situation:
1. I finished step 2 with the cohere/zhihu_query dataset. The final reward score is 5.07, the rejected score is 0.8, and the acc is 0.79, so step 2 seems successful.
2. when I atte…
-
I want to make a project using reinforcement learning in which a bot sends scam messages to other bots on social media, and the other bots detect the scam and reject it.
I think it needs a deep reinforcement learning…