-
**Describe the bug**
The SqlDatabaseChain node, when you pass in a query, executes the wrong query and fails with the error:
```QueryFailedError: syntax error at or near "SQLQuery"``` and in the logs…
-
I need to analyze and visually present the training process and the outcome of the encoder-decoder training, and compare this with the input features. I use the PyTorch backend. How can I tap the net…
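If the goal is to capture intermediate activations from a PyTorch model, forward hooks are the standard mechanism. A minimal sketch, using a hypothetical toy `nn.Sequential` as a stand-in for the actual encoder-decoder:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the encoder-decoder; substitute your real model.
model = nn.Sequential(
    nn.Linear(8, 4),   # "encoder"
    nn.ReLU(),
    nn.Linear(4, 8),   # "decoder"
)

activations = {}

def make_hook(name):
    # A forward hook receives (module, inputs, output) after each forward pass.
    def hook(module, inputs, output):
        activations[name] = output.detach().cpu()
    return hook

handles = [m.register_forward_hook(make_hook(f"{i}:{type(m).__name__}"))
           for i, m in enumerate(model)]

x = torch.randn(2, 8)
_ = model(x)          # hooks fire here and fill `activations`

for name, act in activations.items():
    print(name, tuple(act.shape))

for h in handles:
    h.remove()        # always remove hooks when you are done
```

The captured tensors can then be plotted (e.g. with matplotlib) per training step to visualize how the latent representation evolves against the input features.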
-
Chainflip has a thing (poorly named) called a broker that does (@0xean fills in sophisticated details).
We should probably run our own, but there are contingencies we'll need:
- a cycling amount…
-
One thing that bugs me: the notification is really nice and well thought out... but when using the "next timeout" button, it uses the next-higher timeout relative to the **initial** timeout, no matter what the r…
-
# Goal
------
* Many of the new LLMs support long context. For example, Llama 3.1 and Mistral Large 2 support 128k;
* The trend is upwards, e.g. Gemini supports 1M - 10M, and Claude supports 200k;
* …
-
PR #129545 introduced a new style for rustdoc API pages.
I appreciate the author's efforts, but the new style still has a few shortcomings.
- It's not as compact as the old style; more than one line…
-
## Current State of OSS FP8 Operators
So far, all examples of fp8 ops (computing in fp8) are scaled matmuls that accumulate in a higher-precision type. In fact, there are really only 2 classes of in…
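The scaled-matmul pattern described above can be sketched in NumPy. This is a crude simulation, not a real fp8 kernel: `fake_fp8_e4m3` only models the mantissa rounding and range clipping of e4m3 (ignoring subnormals and NaN encoding), and the per-tensor scaling scheme is one common choice among several:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value in the fp8 e4m3 format

def fake_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Crude e4m3 simulation: round to 3 mantissa bits, clip to the e4m3 range."""
    m, e = np.frexp(x)                 # x = m * 2**e with |m| in [0.5, 1)
    m = np.round(m * 16.0) / 16.0      # keep 3 explicit mantissa bits
    return np.clip(np.ldexp(m, e), -E4M3_MAX, E4M3_MAX)

def scaled_fp8_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Per-tensor scaled matmul: quantize inputs to (simulated) fp8,
    then multiply, accumulate, and rescale in float32."""
    scale_a = np.abs(a).max() / E4M3_MAX
    scale_b = np.abs(b).max() / E4M3_MAX
    a_q = fake_fp8_e4m3(a / scale_a)
    b_q = fake_fp8_e4m3(b / scale_b)
    # Accumulation happens in float32, not fp8 -- the pattern described above.
    return (a_q.astype(np.float32) @ b_q.astype(np.float32)) * (scale_a * scale_b)

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 32)).astype(np.float32)
b = rng.standard_normal((32, 16)).astype(np.float32)
c = scaled_fp8_matmul(a, b)
```

With 3 mantissa bits, each input carries roughly 2^-4 relative rounding error, so the result stays within a few percent of the float32 reference while the multiplies consume the narrow fp8 inputs.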
-
Thank you for sharing the excellent code and checkpoints! I have run the code described in `Readme.md` and would like to determine whether I correctly understood them.
The current version of `dist…
-
Hello, I'm studying the fused_multi_head_attention example in CUTLASS.
The CUTLASS 3.5.1 README.md says the FlashAttention-2 kernel is in CUTLASS.
But fused_multi_head_attention is based on Meta/xFor…
-
# Positional encoding
From the paper _Attention is all you need_, it is required to implement this feature in order to contribute to the **_Transformers_** milestone.
## References
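A minimal NumPy sketch of the sinusoidal encoding from Section 3.5 of the paper (assuming an even `d_model`; the actual milestone implementation may differ):

```python
import numpy as np

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding from 'Attention Is All You Need':

        PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
        PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    positions = np.arange(max_len)[:, None]                      # (max_len, 1)
    div_term = 10000.0 ** (np.arange(0, d_model, 2) / d_model)   # (d_model/2,)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(positions / div_term)  # even dims: sine
    pe[:, 1::2] = np.cos(positions / div_term)  # odd dims: cosine
    return pe

pe = sinusoidal_positional_encoding(max_len=50, d_model=16)
# pe is added to the token embeddings before the first attention layer;
# pe[0] is [0, 1, 0, 1, ...] since sin(0) = 0 and cos(0) = 1.
```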
* [Attention Is A…