-
Hello, thank you for this great work!
As I understand it, RecSim is built for use with TF 1 because, if I'm not mistaken, we must provide a session to an agent. Do you plan to adapt the interface for the…
-
I started to doubt whether it was actually learning, so I ran a quick test with only 300 training runs.
RL: Deep Q-Learning with experience replay {epsilon: 0, discount rate: 0.95}
NN: {optimizer: Adam, loss function: MSE, activation layer: ReLU}
Player se…
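The two pieces the config above names, experience replay and an epsilon-greedy policy, can be sketched in plain Python. Note that `epsilon: 0` means the agent never explores and always takes the greedy action, which alone can stall learning. The class and function names below are illustrative, not from the original code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience replay buffer for DQN-style training."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks temporal correlation between transitions.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def epsilon_greedy(q_values, epsilon):
    """Greedy action with probability 1 - epsilon, random action otherwise."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0` the second function reduces to a pure argmax over Q-values, so a small positive epsilon (often annealed from 1.0) is usually needed early in training.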
-
https://xiang578.com/post/reinforce-learnning-basic.html
Info — Slide download: Hung-yi Lee - Deep Reinforcement Learning. Course video: DRL Lecture 1: Policy Gradient (Review) - YouTube. Change Log 20191226: organized PPO-related ma…
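Since the notes cover policy gradients and PPO, the core of PPO's clipped surrogate objective is compact enough to sketch directly (a per-sample illustration, not the lecture's code):

```python
def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO-Clip surrogate for a single (state, action) sample.

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] keeps one update from moving the policy
    too far from the policy that collected the data.
    """
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps) * advantage
    # Taking the min makes the objective a pessimistic (lower) bound.
    return min(unclipped, clipped)
```

For a positive advantage the clip caps how much the action's probability can be pushed up; for a negative advantage it caps how much it can be pushed down.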
-
Following the `pytorch_inference.ipynb` steps, I created an ipynb in `SageMaker/amazon-sagemaker-tsp-deep-rl` but got
`ModuleNotFoundError: No module named 'problems'`
Cloned `https://github.com/chaitjo…
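A `ModuleNotFoundError` like this usually means the notebook's working directory is not the directory that contains the `problems` package. One common workaround is to prepend the repo root to `sys.path` in the first notebook cell; the path below is an assumption about where the repo was cloned on the SageMaker instance, so adjust it to your layout:

```python
import os
import sys

# Hypothetical location: wherever the repo containing the
# `problems` package was cloned on the notebook instance.
repo_root = os.path.expanduser("~/SageMaker/amazon-sagemaker-tsp-deep-rl")

# Prepend so this repo wins over any same-named installed package.
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)
```

Alternatively, starting Jupyter from the repo root (so the notebook's working directory contains `problems/`) avoids the path edit entirely.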
-
I have seen that https://gradio.app/ is used for the UIs on Hugging Face. @wang-boyu, have you looked into it, since it is listed as one of the possible frameworks in the GSoC wiki?
See also ht…
-
**Read the paper, answer these questions, and document the results on the wiki**
- What is the exact definition of the problem?
- What are the inputs and outputs (for each timestep vs. for each game)?
- How …
-
### 🐛 Bug
I am developing a custom feature extractor type (based on DeepSets) for SB3 and want to train and optimize it with sb3_zoo. For this I add the following to a custom config.py file:
```pyth…
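For context on the DeepSets idea: in SB3 such an extractor would subclass `BaseFeaturesExtractor`, but the core computation is independent of the framework, namely a per-element encoder, a symmetric pooling over the set, and a head on the pooled vector. A minimal NumPy sketch of that pattern (all names and shapes here are illustrative assumptions, not the issue author's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical DeepSets extractor: phi per element, mean pool, then rho.
W_phi = rng.normal(size=(4, 8))   # per-element encoder weights (phi)
W_rho = rng.normal(size=(8, 16))  # post-pooling head weights (rho)

def deepsets_features(obs_set):
    """Map a (n_elements, 4) set observation to a (16,) feature vector.

    Mean pooling over axis 0 makes the output invariant to the
    ordering of the set elements, the defining DeepSets property.
    """
    h = np.maximum(obs_set @ W_phi, 0.0)    # phi with ReLU, shared per element
    pooled = h.mean(axis=0)                 # symmetric pooling over the set
    return np.maximum(pooled @ W_rho, 0.0)  # rho head
```

Because the pooling is symmetric, permuting the rows of `obs_set` leaves the features unchanged, which is exactly what makes this architecture suitable for set-valued observations.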
-
Hello Baris!
Great work on your master's thesis! I am doing similar work for my master's thesis, and I am using some of your work as inspiration!
We are using MuJoCo and Robosuite, and I am therefore not…
-
Hello, I have a quick question.
I know most RLHF setups use KL divergence.
https://github.com/nebuly-ai/nebullvm/blob/aad1c09ce20946294df3ec83569bad9496f58d0e/apps/accelerate/chatllama/chatllam…
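For reference, the usual role of the KL term in RLHF is to penalize the fine-tuned policy for drifting away from the reference (pretrained) model. A minimal sketch of that shaped reward, using the full KL over a discrete distribution (many implementations instead use a per-token log-ratio estimator; the names and `beta` value below are illustrative):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_shaped_reward(reward, p_policy, p_ref, beta=0.1):
    """RLHF-style shaped reward: task/preference reward minus a KL penalty
    that keeps the fine-tuned policy close to the reference model."""
    return reward - beta * kl_divergence(p_policy, p_ref)
```

When the policy matches the reference exactly the penalty vanishes, and it grows as the policy concentrates probability on tokens the reference considers unlikely.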
-
Hi author,
Thanks very much for your great work!
- I'm following the online training instructions in Deep_Learning_Readme.md, but when I finished the third step (starting the UDT side), I couldn't see…