-
Hi Team,
Noticed that for different attention setups, i.e. different input batch_size, sequence_length, num_heads, heads_dim, from the demo in https://triton-lang.org/main/getting-started/tutorials/06-f…
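To reproduce across setups, a minimal shape sweep could look like the sketch below. It uses `torch.nn.functional.scaled_dot_product_attention` as a reference; the commented-out `attention` call and its `(q, k, v, causal, sm_scale)` signature are assumptions about the callable defined in the linked tutorial and may differ between Triton versions.
```python
# Sketch: sweep a few attention shapes and compare against PyTorch's SDPA reference.
import torch

def reference_sdpa(q, k, v, causal, sm_scale):
    # Built-in attention as a numerical/performance baseline
    return torch.nn.functional.scaled_dot_product_attention(
        q, k, v, is_causal=causal, scale=sm_scale)

shapes = [  # (batch_size, num_heads, sequence_length, head_dim)
    (1, 8, 1024, 64),
    (4, 16, 2048, 64),
    (2, 32, 4096, 128),
]
for bs, nh, sl, hd in shapes:
    q, k, v = (torch.randn(bs, nh, sl, hd, device="cuda", dtype=torch.float16)
               for _ in range(3))
    sm_scale = hd ** -0.5
    ref = reference_sdpa(q, k, v, causal=True, sm_scale=sm_scale)
    # out = attention(q, k, v, True, sm_scale)  # tutorial kernel, assumed signature
    # print((out - ref).abs().max().item())
    print(bs, nh, sl, hd, tuple(ref.shape))
```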
-
I was sifting through the cuDNN documentation and came across these snippets:
"cuDNN BF16 and FP16 Fused Flash Attention now supports embedding dim = 256 use cases in forward propagation.
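For anyone who wants to exercise the dim-256 forward case themselves, a minimal sketch is below. It assumes a recent PyTorch build (roughly 2.5+) that exposes the cuDNN SDPA backend; that routing is not part of the quoted cuDNN docs.
```python
# Sketch: BF16 attention forward with head/embedding dim = 256 via the cuDNN backend.
import torch
from torch.nn.attention import sdpa_kernel, SDPBackend

q, k, v = (torch.randn(2, 8, 1024, 256, device="cuda", dtype=torch.bfloat16)
           for _ in range(3))
with sdpa_kernel(SDPBackend.CUDNN_ATTENTION):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
print(out.shape, out.dtype)
```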
-
I don't think this script has multiple-artboard support.
I have 4 artboards (6000x4000 pixels), each with 22 layers (88 layers total), and there is transparency.
When I run the script it outputs 88 …
-
When training with graphbolt, it does not check whether the necessary files exist under the graph directory. We should check that the fused embedding is under the graph directory before going to the training stage.
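A minimal sketch of such a pre-flight check is below; the required file names (e.g. `fused_embedding`) and the graph directory path are placeholders, since the exact artifacts graphbolt expects aren't spelled out here.
```python
# Sketch: fail fast if expected artifacts are missing, instead of erroring mid-training.
import os
import sys

def check_graph_dir(graph_dir, required=("fused_embedding",)):
    missing = [name for name in required
               if not os.path.exists(os.path.join(graph_dir, name))]
    if missing:
        sys.exit(f"Missing files under {graph_dir}: {missing}; "
                 "run preprocessing before training.")

check_graph_dir("data/graph")  # hypothetical graph directory
# ...only then enter the training stage
```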
-
When I use the trl library I get this error:
is not a valid OptimizerNames, please select one of ['adamw_hf', 'adamw_torch', 'adamw_torch_fused', 'adamw_torch_xla', 'adamw_torch_npu_fused', 'adam…
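For reference, this error means the `optim` field doesn't match one of transformers' OptimizerNames values. A minimal sketch of setting a valid one is below; trl's trainer configs build on `transformers.TrainingArguments`, so the field is the same, but the exact config class depends on your setup.
```python
# Sketch: pick one of the optimizer names listed in the error message.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",      # placeholder
    optim="adamw_torch",   # must be a valid OptimizerNames value
)
print(args.optim)
```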
-
The operators abstract these files away from the hardware differences, so I see little reason to keep both of them around. They should be unified so they can no longer go out of sync.
-
**Is your feature request related to a problem? Please describe.**
We currently use `moreh_layer_norm_backward`, but its performance is quite poor (see the reference sketch below).
**Describe the solution you'd like**
Fused o…
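To make the target concrete, here is a minimal unfused reference of the layer-norm backward (last-dimension normalization, biased variance, matching `torch.nn.functional.layer_norm`) that a fused kernel would compute in one pass. This is an illustrative sketch checked against autograd, not the moreh implementation.
```python
import torch

def layer_norm_backward(dy, x, gamma, eps=1e-5):
    # Normalization over the last dimension with biased variance,
    # matching torch.nn.functional.layer_norm.
    mu = x.mean(-1, keepdim=True)
    var = x.var(-1, unbiased=False, keepdim=True)
    rstd = (var + eps).rsqrt()
    xhat = (x - mu) * rstd
    dxhat = dy * gamma
    dx = rstd * (dxhat
                 - dxhat.mean(-1, keepdim=True)
                 - xhat * (dxhat * xhat).mean(-1, keepdim=True))
    dgamma = (dy * xhat).sum(0)
    dbeta = dy.sum(0)
    return dx, dgamma, dbeta

# Check against autograd in float64.
x = torch.randn(32, 512, dtype=torch.float64, requires_grad=True)
g = torch.randn(512, dtype=torch.float64, requires_grad=True)
b = torch.randn(512, dtype=torch.float64, requires_grad=True)
y = torch.nn.functional.layer_norm(x, (512,), g, b)
dy = torch.randn_like(y)
y.backward(dy)
dx, dg, db = layer_norm_backward(dy, x.detach(), g.detach())
print(torch.allclose(dx, x.grad), torch.allclose(dg, g.grad), torch.allclose(db, b.grad))
```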
-
@amirkavyan Everything that is currently implemented for the fused part in 4.2 is also implemented here, and it has the same problem as 4.2: most of the benchmarks don't work. Some of them run into a segfault and…
-
Hey team, AO provides awesome FP8 support with torch.compile to get speed and memory improvements. However, since torch.compile is not always easy to apply to some models, such as [MoE HF implement…
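For context, the flow being referred to is roughly the sketch below; the `convert_to_float8_training` name is taken from torchao's float8 recipes and assumes a recent torchao install plus an SM89+ GPU.
```python
# Sketch of the torchao float8 + torch.compile training-time flow.
import torch
from torchao.float8 import convert_to_float8_training

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096, bias=False),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096, bias=False),
).cuda().to(torch.bfloat16)

convert_to_float8_training(model)   # swaps nn.Linear for Float8Linear in place
model = torch.compile(model)        # the step that is hard for some MoE models
out = model(torch.randn(64, 4096, device="cuda", dtype=torch.bfloat16))
print(out.shape, out.dtype)
```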
-
### Description of the bug:
Running this script:
https://github.com/johndpope/IMF/blob/main/tf-export-edge.py
```shell
python tf-export-edge.py
2024-10-19 07:20:44.455948: I tensorflo…