-
Dear author,
Thank you very much for making such a great project; it has been very helpful to my research. But there is so much code and so many functions that I don't know where to start. Can you help me get started…
-
-
Within Tribler we aim to calculate and show the trust level of our neighbors.
Trust levels evolve over time as more information comes in from our blockchain. Restarting the calculation from scratch…
-
First, thanks for the amazing repository! I wanted to load a pretrained model from Hugging Face, which typically creates a folder containing the `config.json` and the `.bin` file with the weights inside.…
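For context, loading such a folder with the `transformers` library roughly looks like this (a minimal sketch; the directory path below is a placeholder, not from the original post):

```python
from transformers import AutoConfig, AutoModel

# Local folder containing config.json and pytorch_model.bin,
# e.g. produced by save_pretrained() or a Hub download (path is hypothetical)
model_dir = "path/to/pretrained_model"

config = AutoConfig.from_pretrained(model_dir)
model = AutoModel.from_pretrained(model_dir, config=config)
```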
-
Hi. Thank you for the great work.
### Describe the bug
I am trying to train a virtual robot with multiple mimic joints, more specifically like this: https://github.com/KKSTB/isaac_lab_gundam_rob…
-
Hello @miyosuda,
Thanks for sharing the code. Please ignore the title: I tried out your code on the cartpole balance control problem instead of the Atari games, and it works well. But a few ques…
-
todo: add KL penalty between current and marginal policy as an intrinsic reward/penalty
log π(a|s)/p(a)
the question is whether this will induce perseveration
the only thing to figure out is how to …
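A minimal sketch of the penalty term in the note above, assuming a discrete action space and PyTorch; the marginal p(a) is estimated here with a running average, and every name below is hypothetical:

```python
import torch

def kl_intrinsic_penalty(policy_logits, action, marginal_probs, beta=0.01):
    """Sampled estimate of beta * log(pi(a|s) / p(a)) for the taken action."""
    log_pi = torch.log_softmax(policy_logits, dim=-1)  # log pi(.|s)
    log_p = torch.log(marginal_probs)                  # log p(.)
    return beta * (log_pi[action] - log_p[action])

def update_marginal(marginal_probs, policy_probs, tau=0.005):
    """Exponential-moving-average estimate of the marginal policy p(a)."""
    return (1.0 - tau) * marginal_probs + tau * policy_probs
```

Subtracting this term from the environment reward penalizes actions the policy takes more often than its own long-run marginal; adding it instead would reward them, which is where the perseveration question comes in.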
-
There are several optimizations to our PPO recipe that could push its performance closer to SOTA. There are also several pieces of documentation we could offer alongside this recipe t…
-
Also, is your code based on the paper with new modifications? The code involves A2C-like strategies that don't seem to be presented in the paper, which is a bit unclear to me. I hope you can help.
-
Hello,
We recently fixed a bug in the ppo2 implementation that should solve the performance gap observed ;)
So I recommend you update to the latest version. Btw, I'm quite interested in your benchmar…