-
Hello Team,
1. How can the Python Triton code from the tutorials at https://github.com/openai/triton/tree/main/python/tutorials be converted to LLVM IR or other intermediate representations?
2. Do we have any langua…
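For question 1, a hedged sketch of one way to look at the generated IR (the `.asm` dictionary on the launch handle and its keys such as `llir` vary across Triton versions, so treat this as illustrative rather than the official workflow):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

x = torch.randn(1024, device="cuda")
y = torch.randn(1024, device="cuda")
out = torch.empty_like(x)
# In many Triton releases the launch returns a handle to the compiled kernel.
handle = add_kernel[(triton.cdiv(x.numel(), 128),)](x, y, out, x.numel(), BLOCK=128)
# The compiled artifact typically exposes the lowering stages, e.g.
# 'ttir' (Triton IR), 'ttgir' (Triton GPU IR), 'llir' (LLVM IR) and 'ptx'.
print(handle.asm.keys())
print(handle.asm.get("llir", "")[:500])
```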
-
Hi! The [flash attention implementation](https://github.com/openai/triton/blob/main/python/triton/ops/flash_attention.py) is really helpful as a reference. I noticed that the code currently makes some…
-
**Issue description:**
AttributeError: 'LlamaSplitFuseInferStateInfo' object has no attribute 'logn_values'
…
-
👋 This dashboard summarizes my activity on the repository, including available improvement opportunities.
## Recommendations
_Last analysis: Jun 15 | Next scheduled analysis: Jun 22_
### Open
- h…
-
### 🐛 Describe the bug
When I compute 2.0 ** s on CUDA for very small s, so that the result lands in the float32 denormal range, the result from PyTorch eager mode is the correct denormalized floating point…
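For context, a minimal repro sketch of the eager-mode side of this report (the exponent `-140` is an assumption chosen so that `2 ** s` lands in the float32 denormal range; the compiled path that reportedly flushes to zero is not shown here):

```python
import torch

# 2**-140 ~= 7.2e-43 is below the smallest normal float32 (~1.18e-38) but
# above the smallest denormal (~1.4e-45), so the exact result is denormal.
s = torch.tensor([-140.0], device="cuda", dtype=torch.float32)
eager = 2.0 ** s
print(eager.item())         # eager mode reportedly keeps the tiny denormal value
print(eager.item() > 0.0)   # a flush-to-zero implementation would print False
```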
-
When executing the script `examples/offline_inference_with_prefix.py`, it calls `context_attention_fwd` from `vllm.model_executor.layers.triton_kernel.prefix_prefill`, which triggered the following er…
-
Could tl.dot support the mma 32x8x16 (m-n-k) shape, which is supported by the tensor cores?
In the process of developing operators with Triton, it's essential to minimize the N dimension of blocks as much as possibl…
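To make the constraint concrete, here is a hedged sketch of the workaround needed today (assuming `tl.dot` requires every block dimension to be at least 16, which is my understanding of the current restriction): an output with N = 8 has to be padded to `BLOCK_N = 16` and masked on load and store, wasting half of the tile.

```python
import triton
import triton.language as tl

@triton.jit
def small_n_matmul(a_ptr, b_ptr, c_ptr, M, N, K,
                   BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # Row-major A (M, K), B (K, N), C (M, N); one program per BLOCK_M rows.
    pid_m = tl.program_id(0)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)   # padded to 16 even when N == 8
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        offs_k = k + tl.arange(0, BLOCK_K)
        a = tl.load(a_ptr + offs_m[:, None] * K + offs_k[None, :],
                    mask=(offs_m[:, None] < M) & (offs_k[None, :] < K), other=0.0)
        b = tl.load(b_ptr + offs_k[:, None] * N + offs_n[None, :],
                    mask=(offs_k[:, None] < K) & (offs_n[None, :] < N), other=0.0)
        acc += tl.dot(a, b)
    c_mask = (offs_m[:, None] < M) & (offs_n[None, :] < N)
    tl.store(c_ptr + offs_m[:, None] * N + offs_n[None, :], acc, mask=c_mask)

# Hypothetical launch: N is only 8, but BLOCK_N still has to be 16.
# small_n_matmul[(triton.cdiv(M, 64),)](a, b, c, M, 8, K, BLOCK_M=64, BLOCK_N=16, BLOCK_K=32)
```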
-
Hello Triton team, I did some quick profiling of the Triton matmul kernel https://github.com/openai/triton/blob/main/python/triton/ops/matmul.py using the PyTorch profiler.
![image](https://github.com/ope…
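In case it helps to reproduce the numbers, a hedged sketch of how such a profile can be collected (this assumes an older Triton release where `triton.ops.matmul` is importable, as in the linked file; shapes and dtypes are arbitrary):

```python
import torch
from torch.profiler import profile, ProfilerActivity
import triton.ops

a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)

triton.ops.matmul(a, b)  # warm-up so autotuning stays outside the profiled region

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    triton.ops.matmul(a, b)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```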
-
### Your current environment
```text
Collecting environment information...
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warni…
-
Currently, it is always `None`, which defaults to `float32`:
https://github.com/openai/triton/blob/f21b36c8c54f35a88e96d7217e2c6bc9cc02ee69/python/test/unit/operators/test_matmul.py#L179
I believe…
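For readers skimming without the test open, a hedged illustration of the pattern being described (the function name is made up for illustration, not the actual test code): a dtype argument that is always left as `None` silently resolves to `float32`, so only the default path ever gets exercised.

```python
import torch

def resolve_acc_dtype(dtype=None):
    # Illustrative only: when callers never pass a dtype, this always resolves
    # to float32 and the non-default branches go untested.
    return torch.float32 if dtype is None else dtype

assert resolve_acc_dtype() is torch.float32
assert resolve_acc_dtype(torch.float16) is torch.float16
```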