-
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: an illegal memory access was encountered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Exc…
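For errors like this, a common first step is to force synchronous kernel launches so the Python stack trace points at the op that actually faulted; a minimal sketch (the variable must be set before any CUDA work happens, or exported in the shell before launching the script):

```python
import os

# Make CUDA kernel launches synchronous so the reported stack trace
# identifies the faulting op instead of a later, unrelated call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the env var so it takes effect
```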
-
Hi, I managed to train step 1 and step 2 for a 6.7B actor model and a 350M reward model, but I keep running into an out-of-memory issue for step 3. I was wondering what config was used in your tests with…
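For context on why step 3 is so much heavier than steps 1 and 2: the PPO stage keeps four models resident at once (actor, frozen reference copy, critic, reward), and the trainable actor also carries optimizer state. A rough back-of-the-envelope sketch, assuming full fine-tuning of the actor and ignoring activations, KV caches, and fragmentation:

```python
BYTES_FP16 = 2
GiB = 1024 ** 3

actor_params  = 6.7e9    # actor and its frozen reference copy
reward_params = 0.35e9   # reward model; the critic is typically a similar size

# fp16 weights for the four models kept resident in step 3:
# actor + frozen reference + critic + reward model.
weights = BYTES_FP16 * (2 * actor_params + 2 * reward_params)

# Mixed-precision Adam state for the trainable actor alone (if it is fully
# fine-tuned): fp32 master weights + momentum + variance = 12 bytes/param.
# With LoRA only the adapter weights carry optimizer state, so this shrinks a lot.
actor_adam_state = 12 * actor_params

print(f"fp16 weights                 ~ {weights / GiB:5.1f} GiB")           # ~26 GiB
print(f"actor Adam state             ~ {actor_adam_state / GiB:5.1f} GiB")  # ~75 GiB
print(f"lower bound (no activations) ~ {(weights + actor_adam_state) / GiB:5.1f} GiB")
```

Roughly 100 GiB before activations, which is why ZeRO partitioning, offload, or LoRA is usually needed at this scale.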
-
Hello,
I'd like to ask what role the VAE plays in the actor-critic network, and what would happen if it were removed?
-
Hello, I need to make SacAgent work with discrete actions, so I tried to implement the Gumbel-Softmax reparameterization trick by re-defining the relevant classes. However, the calculation of `agent.train(experie…
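For reference, the core of the Gumbel-Softmax trick is to draw a relaxed one-hot action from the categorical logits so that gradients flow through the sampling step. A minimal, framework-agnostic sketch in PyTorch (the tf-agents wiring around `SacAgent` is not shown, and `logits` / `tau` are placeholder names):

```python
import torch
import torch.nn.functional as F

def sample_discrete_action(logits: torch.Tensor, tau: float = 1.0, hard: bool = True):
    """Draw a differentiable (relaxed) one-hot sample from categorical logits.

    With hard=True the forward pass returns an exact one-hot action while the
    backward pass uses the soft relaxation (straight-through estimator), which
    is what makes the discrete choice trainable by gradient descent.
    """
    return F.gumbel_softmax(logits, tau=tau, hard=hard)

# Toy usage: a batch of 4 states with 6 discrete actions each.
logits = torch.randn(4, 6, requires_grad=True)
action_one_hot = sample_discrete_action(logits, tau=0.5)   # shape (4, 6)
fake_q = torch.randn(4, 6)                                 # stand-in for per-action Q-values
loss = -(action_one_hot * fake_q).sum(dim=-1).mean()       # differentiable w.r.t. logits
loss.backward()
print(logits.grad.shape)                                   # (4, 6): gradients reach the logits
```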
-
Could you tell me the name of the policy gradient method you used?
I'm asking because it doesn't seem to be an actor-critic approach.
-
When running the book's reference DDPG code, I observed that as training progresses the actor loss keeps rising, and the critic loss also fluctuates up and down.
Isn't the actor loss defined as the negative Q-value? If this value keeps getting larger, doesn't that mean the Q-value of the actions keeps getting smaller? Isn't that the opposite of our goal of maximizing the Q-value of the actions?
So is the quality of this algorithm ultimately judged by whether its reward is rising?
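For reference, this is how the two losses are typically computed in a DDPG update; the actor loss is indeed -Q(s, μ(s)), but since the critic itself keeps changing during training, a rising actor loss does not by itself mean the policy is getting worse, and episode return is the more reliable signal. Names such as `actor`, `critic`, `target_actor`, `target_critic` are placeholders, not the book's code:

```python
import torch
import torch.nn.functional as F

def ddpg_losses(actor, critic, target_actor, target_critic,
                state, action, reward, next_state, done, gamma=0.99):
    # Critic loss: TD error against the bootstrapped target-network value.
    with torch.no_grad():
        next_q = target_critic(next_state, target_actor(next_state))
        td_target = reward + gamma * (1.0 - done) * next_q
    critic_loss = F.mse_loss(critic(state, action), td_target)

    # Actor loss: minimize -Q(s, mu(s)), i.e. maximize the critic's estimate
    # of the actor's own action. The scale of this loss tracks the (moving)
    # critic, so its trend is less informative than the reward curve.
    actor_loss = -critic(state, actor(state)).mean()
    return actor_loss, critic_loss
```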
-
Does anyone know what's wrong with using tf-agents here that triggers this ValueError?
ValueError: Inputs to TanhNormalProjectionNetwork must match the sample_spec.dtype.
In call to configurable 'SacAgent'…
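This error usually comes from a dtype mismatch between the environment's specs (for example float64 observations or an integer/float64 action spec, as gym often produces) and the float32 tensors the agent's networks emit. A minimal sketch of a float32 bounded action spec, with placeholder shape and bounds, assuming that is the mismatch here:

```python
import numpy as np
from tf_agents.specs import array_spec

# SAC's TanhNormalProjectionNetwork works with a continuous float32 action
# spec; an int or float64 spec can trigger the dtype mismatch ValueError.
action_spec = array_spec.BoundedArraySpec(
    shape=(1,),            # placeholder: one continuous action dimension
    dtype=np.float32,
    minimum=-1.0,
    maximum=1.0,
    name="action",
)
```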
-
* paper
Addressing Function Approximation Error in Actor-Critic Methods
https://arxiv.org/abs/1802.09477
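For context, the method this paper proposes (TD3) curbs critic overestimation with clipped double-Q learning and target policy smoothing: the critic target uses the minimum of two target critics evaluated at a noise-perturbed target action. A minimal sketch of that target computation, with placeholder network and hyperparameter names:

```python
import torch

def td3_target(target_actor, target_q1, target_q2,
               reward, next_state, done, gamma=0.99,
               policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    with torch.no_grad():
        # Target policy smoothing: clipped noise on the target action.
        next_action = target_actor(next_state)
        noise = (torch.randn_like(next_action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)

        # Clipped double-Q: take the minimum of the two target critics
        # to reduce overestimation bias.
        next_q = torch.min(target_q1(next_state, next_action),
                           target_q2(next_state, next_action))
        return reward + gamma * (1.0 - done) * next_q
```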
-
I am trying to run the DeepSpeed-Chat example on a single GPU, an NVIDIA A6000 48 GB.
I could run all 3 steps fine using the 1.3b example.
But when I run `single_gpu/run_6.7b_lora.sh`, I get a CUDA Out Of Memory…
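One knob that often makes the difference on a single 48 GB card is ZeRO stage 3 with CPU offload of parameters and optimizer state, at the cost of speed. A minimal sketch of the relevant DeepSpeed config section, written as a Python dict; the values are illustrative and the DeepSpeed-Chat scripts expose their own flags for this:

```python
# Illustrative ZeRO-3 + CPU offload portion of a DeepSpeed config; batch size,
# fp16, and scheduler settings around it are omitted. Whether this fits in
# 48 GB still depends on sequence length, batch size, and the LoRA settings.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,   # smallest possible per-GPU batch
    "gradient_accumulation_steps": 8,      # keep the effective batch size up
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "gradient_clipping": 1.0,
}
```

Gradient checkpointing on the model and a shorter maximum sequence length also cut activation memory, if the script exposes those options.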
-