policy-optimization Search Results

ZONG0004/MacroHFT #6

low-level policy optimization setup

Very interesting paper. Incredibly insightful. The paper specifically mentions: _"The coefficient 𝛼𝑙 of each sub-agent is tuned separately over {0, 1, 4} and selected based on the mean return …

jamesjjk updated 2 weeks ago

long8v/PTIR #187

[168] Proximal Policy Optimization Algorithms

[paper](https://arxiv.org/pdf/1707.06347) ## TL;DR - **I read this because.. :** for background knowledge - **task :** RL - **problem :** Q-learning is too unstable, and TRPO is relatively complex. Data-efficient and scalable arch…

long8v updated 3 weeks ago

long8v/PTIR #154

[142] Trust Region Policy Optimization

[paper](https://arxiv.org/pdf/1502.05477.pdf) ## TL;DR - **I read this because.. :** CS285 final assignment - **task :** reinforcement learning - **problem :** Is there a policy update method that is theoretically guaranteed to always improve performance…

long8v updated 4 weeks ago

magenx/Magento-2-aws-cluster-terraform #38

Terraform issues on Lambda & media policy

I guess after adding the new changes for media optimization with Lambda, Terraform seems to fail due to some issues 1. Invalid Actions & Resources at : │ │ with aws_s3_bucket_policy.media, │…

ayoubeddafali updated 2 months ago

matrixorigin/matrixone #10208

[Tech Request]: Storage Optimization Policy Zoo

### Is there an existing issue for the same feature request? - [X] I have checked the existing issues. ### Is your feature request related to a problem? ```Markdown Storage Optimization Policy is e…

fengttt updated 2 months ago

matrixorigin/matrixone #11891

[Subtask]: implement zonemap-based optimization policy

### Parent Issue #10208 ### Detail of Subtask Implement zonemap-based optimization policy ### Describe implementation you've considered _No response_ ### Additional information _No response_

XuPeng-SH updated 2 months ago

DataCanvasIO/YLearn #56

Policy Optimization API Usage

Hi, I'm trying to do policy optimization using YLearn. I have read the docs about this but didn't understand the meaning very well. Formally, a policy optimization problem can be written as: $x^{*}=\t…

zhj2022 updated 7 months ago

CarperAI/trlx #504

Direct Policy Optimization

### 🚀 The feature, motivation, and pitch Hey all! Appreciate the work. Is there any word on whether DPO [(direct policy optimization)](https://arxiv.org/abs/2305.18290) will be integrated into the…

Reichenbachian updated 1 year ago

arXivTimes/arXivTimes #366

Proximal Policy Optimization Algorithms

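## In a nutshell Policy gradient is used for a wide range of tasks, but choosing the size of the policy update is difficult: if it is too small, convergence is slow, and if it is too large, training collapses. Building on TRPO, a model that constrains the distance between the policy distributions before and after an update, the authors developed PPO, a method that simplifies the computation. ### Paper link https://openai-public.s3-us-west-…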

icoxfog417 updated 7 years ago

d2l-ai/d2l-en #2424

Policy Optimization and PPO

Dear all, While the book currently has a small section on Reinforcement Learning covering MDPs, value iteration, and the Q-Learning algorithm, the book still does not cover an important family of a…

BrianPulfer updated 1 year ago

1000+ results for policy-optimization
