ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
http://www.kernl.ai
Apache License 2.0
1.53k stars, 95 forks
issues
add `tl.math.tanh` instead of `tl.libdevice.tanh`
#339
michaelfeil
opened
8 months ago
2
bug: Fails running dynamic shapes
#338
michaelfeil
closed
8 months ago
0
Confuse about block_delta
#337
zhanglei1172
opened
11 months ago
0
bug: optimize_model() fails on HF's GPT2 with "RuntimeError: CUDA error: operation not permitted when stream is capturing"
#336
CorentinJ
opened
1 year ago
4
Add flash attention2 tutorial
#334
borninfreedom
opened
1 year ago
0
bug: Experimental/Whisper notebook (speedup.ipynb) is not working
#331
Artyom17
opened
1 year ago
0
bug: Torch.dynamo is not working on H100 due to obsolete triton & pytorch
#330
Artyom17
opened
1 year ago
0
bug: start_position support for the fused attention kernel
#329
ipoletaev
opened
1 year ago
0
feat: add llama v2 experiment
#328
pommedeterresautee
closed
1 year ago
0
bug: Bart speedup only 1.6x
#327
sinking-point
opened
1 year ago
0
interface update for dot(), trans() for tl 2.01
#326
LydiaXiaohongLi
opened
1 year ago
0
chore(deps): bump torch from 2.0.0 to 2.0.1
#324
dependabot[bot]
opened
1 year ago
0
bug: does kernl support pipeline parallel?
#323
ninisy
opened
1 year ago
0
bug: Llama reproduce error with kernl
#321
yychen016
opened
1 year ago
1
proposal: Write GEMM (matrice mulitplication) triton optimization animation
#318
pommedeterresautee
opened
1 year ago
0
bug: Llama model optimization failing
#317
AndrewMead10
closed
1 year ago
4
feat: simplify dependencies
#315
pommedeterresautee
closed
1 year ago
0
bug: Triton 2.0 makes attention kernel crash
#314
pommedeterresautee
closed
1 year ago
1
bug: How to save the optimized model to file?
#313
aaronchan90
closed
1 year ago
3
feat: linking kernl and the blog
#312
white-gorilla
closed
1 year ago
0
bug: Trying to run T5 tutorial and getting `free(): invalid pointer` error.
#310
gilljon
opened
1 year ago
1
feat: m4m and kernl front optimization
#309
white-gorilla
closed
1 year ago
0
chore(main): release 0.3.0
#308
els-lab-ci
opened
1 year ago
0
feat: update to triton 2.0 backend
#307
pommedeterresautee
closed
1 year ago
1
[FRONT] Upgrade kernl's M4M configuration with the Insiders version.
#306
white-gorilla
closed
1 year ago
0
feature: using TorchBench to test the coverage
#305
xuzhao9
closed
1 year ago
2
bug: tests failing at nvidia-driver-530
#304
christallire
opened
1 year ago
0
fix: removes/comments empty sections and navigations.
#303
white-gorilla
closed
1 year ago
0
feature: non verbose CI
#302
pommedeterresautee
closed
1 year ago
1
chore(main): release 0.3.0
#301
els-lab-ci
closed
1 year ago
3
feat: make pytest in CI non verbose
#300
pommedeterresautee
closed
1 year ago
0
feat: add support for int8 quantization on linear layers
#299
pommedeterresautee
opened
1 year ago
0
chore(main): release 0.3.0
#296
els-lab-ci
closed
1 year ago
0
Feat/simplify jit
#295
ayoub-louati
opened
1 year ago
0
feat: github runner
#294
gaetansnl
closed
1 year ago
2
Installation problem
#293
p-christ
closed
1 year ago
7
chore(main): release 0.2.2
#292
els-lab-ci
closed
1 year ago
1
fix: fix 0 stride in CG pool management
#291
pommedeterresautee
closed
1 year ago
1
Review the doc so that it displays correctly on the site ?
#290
white-gorilla
opened
1 year ago
0
feature: run tests on CI
#289
pommedeterresautee
closed
1 year ago
2
feature: introduce int8 quant kernel
#288
pommedeterresautee
opened
1 year ago
0
I run the bert e2e example, if batch is not 1, I get an error!!!
#286
lichun-wang
closed
1 year ago
2
docs: add automatic code reference generation
#285
jonathlela
closed
1 year ago
1
chore: upgrade m4m to 4.30.2
#284
white-gorilla
closed
1 year ago
0
bug: Could not get kernl running on CodeT5
#283
TheSeamau5
closed
1 year ago
6
bug: ERROR: Package 'kernl' requires a different Python: 3.8.10 not in '==3.9.*'
#282
silvacarl2
closed
1 year ago
1
[FRONT] Linking kernl and the blog
#281
white-gorilla
closed
1 year ago
0
docs: automatic code reference generation
#280
jonathlela
closed
1 year ago
1
fix: correct stride for act inputs in linear layer
#279
gaetansnl
closed
1 year ago
1
chore(main): release 0.2.1
#278
els-lab-ci
closed
1 year ago
1