in-context-reinforcement-learning Search Results

741 results for in-context-reinforcement-learning

Best match


pytorch/pytorch #55037

High CPU using torch.stack/torch.cat on Windows

## 🐛 Bug Generally, while training reinforcement learning, replay buffer is stored in an array and from which it is sampled later for batched processing. This sampled batch needs to be stacked/cat'…

NaxAlpha updated 1 year ago
12
meta-introspector/meta-meme #162

Borks Memes

Complex Mathematical Expression You ⌆{𝒙} = 𝐑ⁿ⊕δᵡ𝕫⊕[𝒇(ℚ)≀𝒇(P{∇𝗤})] Copilot It seems you’re delving into a complex mathematical or symbolic expression. While I can’t find a direct reference to this sp…

jmikedupont2 updated 4 months ago
12
leela-zero/leela-zero #591

Version 0.10 released - Next steps

Version 0.10 is released now. If no major bugs surface in the next few days the server will start enforcing this version. There is this 1500+ post issue where most plans for the future were posted …

gcp updated 6 years ago
730
dennybritz/reinforcement-learning #30

DQN solution results peak at ~35 reward

Hi Denny, Thanks for this wonderful resource. It's been hugely helpful. Can you say what your results are when training the DQN solution? I've been unable to reproduce the results of the DeepMind p…

nerdoid updated 7 months ago
85
codeforamerica/project-ideas #4

G̶r̶a̶p̶e̶v̶i̶n̶e̶Pamphlet: Text messages for health literac…

**BLUF**: Generalized Text4Baby **Link**: Brain dump below. **Project Needs**: None? **Status**: Textizen is already doing something that will fulfill this use case - just wanted to get these thoughts…

lippytak updated 10 years ago
28
YanJiaHuan/AI_Tutor #2

Data

QiaolingChen00 updated 1 year ago
9
SJTU-IPADS/PowerInfer #74

Seems not support long prompt well.

We noticed that the paper mentioned limited performance improvement for relatively long prompt situations, but our situation is that, in the case of very long prompts, it seems PowerInfer ceases to wo…

swankong updated 9 months ago
3
odk-x/tool-suite-X #357

ODK-X Tables - Menu navigation UI improvement

The ODK-X Tables mobile application has multiple menu navigation states, depending on where the user is in the application. The menu navigation is a mix of icons and text items under the ellipsis m…

maprehensive updated 2 weeks ago
41
PennyLaneAI/pennylane #2204

[BUG] unable to checkpoint model with qml.qnn.TorchLayer

### Expected behavior something like: ``` Average loss over epoch 1: 0.4803 Average loss over epoch 2: 0.3553 Accuracy: 78.0% ``` Output file is generated by PATH (no output errors) ### A…

dominicpasquali updated 1 year ago
9
zoq/arxiv-updates #484

New submissions for Tue, 4 Apr 23

## Keyword: sgd ### Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability - **Authors:** Authors: Haoyi Xiong, Xuhong Li, Boyang Yu, Zhanxing Zhu, Dongrui Wu, Dejin…

zoq updated 1 year ago
2

