nod-ai/sharktank
SHARK Inference Modeling and Serving
Apache License 2.0 · 7 stars · 7 forks
Issues (sorted newest first)
#94 · [Discussion]: Quantization model format and representation · luow-amd · opened 20 hours ago · 1 comment
#93 · [punet] Adapt to brevitas quant_params.json change. · stellaraccident · opened 5 days ago · 0 comments
#92 · Add sharding of Punet's up/down block · sogartar · closed 4 days ago · 0 comments
#91 · Comment out punet golden checks until accuracy is validated. · ScottTodd · closed 6 days ago · 0 comments
#90 · Add CI job for punet tests, running nightly. · ScottTodd · closed 6 days ago · 0 comments
#89 · Add quantization docs. · stellaraccident · closed 1 week ago · 0 comments
#88 · [quant] When broadcasting the weight of a bmm, broadcast then ext. · stellaraccident · closed 1 week ago · 0 comments
#87 · [kernel] Fix a bad static/dynamic cast in batch_matmul_transpose_b. · stellaraccident · closed 1 week ago · 0 comments
#86 · [quant] Convert custom conv op to mixed precision. · stellaraccident · closed 1 week ago · 0 comments
#85 · [quant] Dedynamize the integer kernels. · stellaraccident · closed 1 week ago · 0 comments
#84 · [punet] Add integration tests. · stellaraccident · closed 1 week ago · 0 comments
#83 · Add sharded interpolate · sogartar · closed 1 week ago · 0 comments
#82 · Flaky test sharktank\tests\types\dataset_test.py::ThetaTest::testTransform · ScottTodd · opened 1 week ago · 0 comments
#81 · Add sharding of ResnetBlock2D · sogartar · closed 1 week ago · 0 comments
#80 · [model] Avoid const folding extsi ops on weights · antiagainst · closed 1 week ago · 3 comments
#79 · Add some slicing cases for sharded tensors · sogartar · closed 1 week ago · 0 comments
#78 · Add sharding spec for conv 2D layer · sogartar · closed 1 week ago · 0 comments
#77 · Insert padding for conv and pooling ops before bit-width extend operations of operands · aviator19941 · closed 1 week ago · 0 comments
#76 · [punet] CI for quantization import/compilation/golden check · stellaraccident · opened 1 week ago · 2 comments
#75 · [punet] Evaluate rescale efficiency · stellaraccident · opened 1 week ago · 0 comments
#74 · [punet] Implement fp8 attention kernel · stellaraccident · opened 1 week ago · 0 comments
#73 · [sharktank] Integrate punet model into sdxl pipeline · stellaraccident · opened 1 week ago · 0 comments
#72 · Update VGPR info in the kernel optimization guide · kuhar · closed 1 week ago · 0 comments
#71 · Guard the testing facility used to verify last op dispatch. · stellaraccident · closed 1 week ago · 0 comments
#70 · Add test script and workflow for llama export, compile, serve. · ScottTodd · closed 1 week ago · 1 comment
#69 · Add docs and more logging for llama export, compile, run. · ScottTodd · closed 1 week ago · 0 comments
#68 · [punet] Add direct to linalg integer kernels for mmt, conv, pooling sum. · stellaraccident · closed 1 week ago · 0 comments
#67 · Add sharding specs · sogartar · closed 1 week ago · 0 comments
#66 · Add initial amdgpu kernel optimization guide · kuhar · closed 1 week ago · 1 comment
#65 · Disentangle the sharded and split notions · sogartar · closed 1 week ago · 0 comments
#64 · Fix quantizers_test on Windows. · ScottTodd · closed 1 week ago · 1 comment
#63 · Expand sharded element-wise binary ops support · sogartar · closed 1 week ago · 1 comment
#62 · Add a few cases of sharded matmul · sogartar · closed 2 weeks ago · 0 comments
#61 · [punet] Switch weight quantization to signed. · stellaraccident · closed 2 weeks ago · 0 comments
#60 · Make matmul's transpose_rhs default to False and use linear instead · sogartar · closed 2 weeks ago · 0 comments
#59 · Add sharded unreduced tensor type · sogartar · closed 2 weeks ago · 0 comments
#58 · Integer kernels: Matmul_per_axis_q8, attention_per_channel_q8, batch_matmul, convolution · KyleHerndon · opened 2 weeks ago · 1 comment
#57 · Sharded unreduced tensor · sogartar · closed 2 weeks ago · 1 comment
#56 · transpose_rhs argument in matmul · sogartar · closed 2 weeks ago · 5 comments
#55 · qconv testInputAsymPerChannel_WeightAsymPerChannel_NoBias fails · stellaraccident · opened 3 weeks ago · 0 comments
#54 · Add replicated tensor type and handle conv2d with sharded input channels · sogartar · closed 2 weeks ago · 4 comments
#53 · Add linear quantized conv layer. · stellaraccident · closed 3 weeks ago · 0 comments
#52 · [q_impls] Use int32 dtype for linear offset correction · aviator19941 · closed 3 weeks ago · 0 comments
#51 · [punet] Update quantizer to allow explicit mixed precision rescale. · stellaraccident · closed 3 weeks ago · 0 comments
#50 · Add sharding of conv2d, group norm and layer norm · sogartar · closed 3 weeks ago · 1 comment
#49 · Add some features that make it easier to do A/B comparison on punet. · stellaraccident · closed 3 weeks ago · 0 comments
#48 · Iterate on quantized linear and conv layers. · stellaraccident · closed 3 weeks ago · 0 comments
#47 · ROCm requirements fail install with python3.11 · sogartar · closed 1 week ago · 2 comments
#46 · Implement quantization import for punet model. · stellaraccident · closed 1 month ago · 1 comment
#45 · Disable mmtfp op as it is no longer needed and implicated in some issues. · stellaraccident · closed 1 month ago · 0 comments