offline-reinforcement-learning Search Results

rl-tools/rl-tools #7

Use PC training results on Microcontroller

Hi all. First, thank you for your solutions for reinforcement learning. One critical question for me: could I do my RLtools model training phase on the PC (Linux OS) and use these results as an…

ShHasanzadeh updated 2 weeks ago

arXivTimes/arXivTimes #1824

Hyperparameter Selection for Offline Reinforcement Learning

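## In a nutshell A study investigating the robustness of offline reinforcement learning to hyperparameters (hp). It evaluates three methods, the basic imitation learning method Behavior Cloning and the recent methods CRR and D4PG, over specific hp ranges. Variation across hp is large (with a general tendency to overestimate), but the effect can be mitigated by updating the value function with the policy held fixed. ### Paper link https:/…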

icoxfog417 updated 4 years ago

arXivTimes/arXivTimes #1658

An Optimistic Perspective on Offline Reinforcement Learning

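## In a nutshell A study on offline reinforcement learning, which learns from the action history of trained agents. To generalize in the offline setting (where no new data can be collected), it makes predictions with a random ensemble of the value predictions of multiple agents (Random Ensemble Mixture). This achieves performance exceeding that of the original agent. It can also be described as a reinforcement learning version of distillation. ### Paper link https://ar…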

icoxfog417 updated 4 years ago

polixir/OfflineRL #4

When I run the example, I get a RuntimeError: mat1 and mat…

When I run the command python examples/train_task.py --algo_name=mopo --exp_name=halfcheetah --task HalfCheetah-v3 --task_data_type low --task_train_num 2, it shows: ``` File "examples/train_…

lk1983823 updated 2 years ago

arXivTimes/arXivTimes #2019

Offline Reinforcement Learning: Tutorial, Review, and Perspe…

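## In a nutshell A tutorial on offline reinforcement learning, which uses already-collected samples. Before starting the explanation, it first describes the situations in which offline reinforcement learning works effectively, so that the goal of learning is easy to picture (medicine and dialogue, where many trials with human subjects are difficult, are given as examples). ### Paper link https://arxiv.org/abs/2005.01643 ### Authors/…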

icoxfog417 updated 3 years ago

cis3296f24/applebaum-final-project-section-005-applebaum #11

3D Chess Gaming

keywords Section # 005, Java, AI, 3D Chess Game, JavaFX, Blender ### Project Abstract This project proposes the development of an AI-powered 3D chess game that allows users to play onlin…

Gunlords updated 4 weeks ago

huggingface/lerobot #504

Porting HIL-SERL

# HIL-SERL in LeRobot --- On porting [HIL-SERL](https://hil-serl.github.io/) to LeRobot. This page will outline the minimal list of components and tasks that should be implemented in the LeRobot c…

michel-aractingi updated 2 weeks ago

Farama-Foundation/D4RL #190

Environment hopper-medium doesn't exist

### Question When we tried to rerun the code of "Conservative Q-Learning for Offline Reinforcement Learning", we got the following problem: "gym.error.NameNotFound: Environment hopper-medium doesn't exist. …

rdgy2017 updated 1 year ago

hakuhodo-technologies/scope-rl #25

Query on Handling Offline Data with scope-rl

I am working in the field of reinforcement learning research, particularly in medical applications. My inquiry is about using pre-collected offline data (encompassing state, action, next state, an…

jupitersh updated 9 months ago

Farama-Foundation/D4RL #47

Discrepancy between results reported in CQL and D4rl papers

Hi, I notice there are differences between results reported in CQL paper and D4RL paper for this benchmark. Since some of the authors are common for both papers, can you please comment which of tho…

rasoolfa updated 3 years ago

226 results for offline-reinforcement-learning
