in-context-reinforcement-learning Search Results

739 results for in-context-reinforcement-learning

Best match

DLR-RM/stable-baselines3 #2008

[Feature Request] Safe Reinforcement Learning & Multi-Object…

### 🚀 Feature Some reinforcement problems, like [safe reinforcement learning](https://github.com/PKU-Alignment/safety-gymnasium/tree/main), require the environment to return multiple reward-like valu…

cherrywoods updated 1 month ago
2 comments

xp1632/DFKI_working_log #64

LLM+Visual Programming_Technical Courses

- This issue focuses on the technical courses we take about LLM, we'll put the paper part in https://github.com/xp1632/DFKI_working_log/issues/70 --- 1. **ChainForge** https://chainforge.ai/ …

xp1632 updated 2 days ago
21 comments

Farama-Foundation/Gymnasium #161

[Proposal] Add transitional probabilities to Taxi and Cliff …

### Proposal Only Frozen Lake in the toy text grid world environments implements transitional probabilities. Taxi is supposed to have it based on the previous documentation but has never been imp…

axb2035 updated 4 months ago
6 comments

wordplaydev/wordplay #427

Lessons

## What's the problem? Wordplay, as a programming language and sharing platform, does little to help teachers structure, plan, and teach computing. Here are a small --- and likely incomplete --- li…

amyjko updated 1 month ago
25 comments

irthomasthomas/undecidability #657

Finetuning LLMs for ReAct. Unleashing the power of finetunin…

- [ ] [Finetuning LLMs for ReAct. Unleashing the power of finetuning to… | by Pranav Jadhav | Feb, 2024 | Towards AI](https://pub.towardsai.net/finetuning-llms-for-react-9ab291d84ddc) # Finetuning L…

irthomasthomas updated 7 months ago
1 comment

meta-introspector/meta-meme #187

AkashChat Model: Llama 3.1 405B New Language

Let's create a higher bandwidth compressed representation of our communication using a new language that we invent on the Fly that is emergent using emojis and text and mathematical symbols in a f…

jmikedupont2 updated 2 months ago
15 comments

ManifoldRG/Manifold-KB #12

AF Survey - [Paper overview] ReAct: Reasoning and Acting wit…

🔍 **Deep Dive: "ReAct: Reasoning and Acting with Large Language Models"** 📝 **Summary:** The authors introduce "ReAct", a method that synergizes reasoning and acting in large language models (LL…

PranayPasula updated 1 year ago
2 comments

brianpetro/jsbrains #3

FR Smart Chunks Improved handling of common headings

### Discussed in https://github.com/brianpetro/obsidian-smart-connections/discussions/416 Originally posted by **Levani307** January 16, 2024 Hello, I am a Smart Connection Supporter and I lov…

LeoLDLeo updated 5 months ago
5 comments

tikankika/COSEAQ #1

Theory - phase1

Theory in Phase 1

tikankika updated 6 months ago
2 comments

mlflow/mlflow #10477

[FR] Get the Only show diff button back

### Willingness to contribute No. I cannot contribute this feature at this time. ### Proposal Summary Set back the 'Only show diff' from the MLFlow 1 version. ### Motivation > #### What is the us…

ReHoss updated 10 months ago
3 comments
