actor-critic Search Results

1000+ results for actor-critic

CreativeNick/SimToReal #1

[Bug/Error] Concatenating EMPTY array "returns" during first…

### Issue When attempting to run `ppo.py` to train the RL model on `cube_env.py` or the **Bimanual_Allegro_Cube** env, I get an _empty array error_ during Epoch 1 of the iteration loop in `ppo.…

CreativeNick updated 5 months ago
2
rms-support-letter/rms-support-letter.github.io #6078

What can we improve from now on?

Things are calming down, but we have two communities from now on. One denounces RMS, the other doesn't. There's still little communication and there's a bit of conflict when somebody doesn't want to list…

6r1d updated 3 years ago
71
vllm-project/vllm #5805

[Roadmap] vLLM Roadmap Q3 2024

Update: * Please see #6801 for major items in performance sprint. * Please see #8779 for major items in a new architecture aimed at simplicity and performance. * We are in the feedback gathering pha…

simon-mo updated 1 month ago
42
brienzb/toy-box #111

[GTRP] Google Trend Report (2024-05-21)

At 10 p.m. local time in each country, the Google Trend Keyword Report will be published as a comment on this issue. - Korea - France - Switzerland - UK (United Kingdom) - US (United States of America)

brienzb updated 6 months ago
5
JeffersonLab/SciOptControlToolkit #1

March 2024 Release

- [x] Added DDPG - [x] Reworked TD3 - [x] Registered buffers w/ command line arguments or config file control - [x] Prioritized experience replay buffer added - [x] Uniform random experience repl…

armenkasp updated 7 months ago
9
prettier/prettier #4801

formatWithCursor performance bottleneck

I've isolated a bottleneck from our production environment and here's a nifty self-contained benchmark for it: https://gist.github.com/tmcw/1a4e8ee47941454337dc5952dbf90180 (swap require('./') for req…

tmcw updated 11 months ago
15
epicgamer17/rl-research #46

make ppo actor and critic into one network (that just has to…

epicgamer17 updated 8 months ago
1
YifeiZhou02/ArCHer #5

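llama2-7B training on webshop keeps getting worse

Hello, sorry to bother you. I used our code to train on webshop, and the results keep getting worse. We first trained a LoRA on 2,000 webshop samples, then trained llama2-7B on top of that LoRA. Our test method: we evaluate on 200 webshop dialogues, using ngrams (n=2) as the metric. The initialized LoRA scores 129.177; after 2,000 training iterations it scores 68.546. The training parameters are `# Adversar…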

xiaxiaxiatengxi updated 7 months ago
1
zerosansan/td3_ddpg_sac_dqn_qlearning_sarsa_mobile_robot_navigation #3

Error after launching "roslaunch turtlebot3_rl_sim start_td3…

Hello sir, I am getting the error below after launching "roslaunch turtlebot3_rl_sim start_td3_training.launch". I have tried to solve it many times. Please help me resolve this issue, sir. **error:** use…

vamsi8106 updated 5 months ago
5
ray-project/ray #14878

[rllib] SAC numerical instability

### What is the problem? SAC calculates the Gaussian log probability based on clamped values, which can result in very large values if the tanh saturates and, as a consequence, result in explodin…

dHonerkamp updated 5 months ago
1
