nod-ai/sharktank
SHARK Inference Modeling and Serving
Apache License 2.0 · 7 stars · 7 forks
Issues (sorted newest first)
#94 · [Discussion]: Quantization model format and representation · luow-amd · opened 20 hours ago · 1 comment
#93 · [punet] Adapt to brevitas quant_params.json change. · stellaraccident · opened 5 days ago · 0 comments
#92 · Add sharding of Punet's up/down block · sogartar · closed 4 days ago · 0 comments
#91 · Comment out punet golden checks until accuracy is validated. · ScottTodd · closed 6 days ago · 0 comments
#90 · Add CI job for punet tests, running nightly. · ScottTodd · closed 6 days ago · 0 comments
#89 · Add quantization docs. · stellaraccident · closed 1 week ago · 0 comments
#88 · [quant] When broadcasting the weight of a bmm, broadcast then ext. · stellaraccident · closed 1 week ago · 0 comments
#87 · [kernel] Fix a bad static/dynamic cast in batch_matmul_transpose_b. · stellaraccident · closed 1 week ago · 0 comments
#86 · [quant] Convert custom conv op to mixed precision. · stellaraccident · closed 1 week ago · 0 comments
#85 · [quant] Dedynamize the integer kernels. · stellaraccident · closed 1 week ago · 0 comments
#84 · [punet] Add integration tests. · stellaraccident · closed 1 week ago · 0 comments
#83 · Add sharded interpolate · sogartar · closed 1 week ago · 0 comments
#82 · Flaky test sharktank\tests\types\dataset_test.py::ThetaTest::testTransform · ScottTodd · opened 1 week ago · 0 comments
#81 · Add sharding of ResnetBlock2D · sogartar · closed 1 week ago · 0 comments
#80 · [model] Avoid const folding extsi ops on weights · antiagainst · closed 1 week ago · 3 comments
#79 · Add some slicing cases for sharded tensors · sogartar · closed 1 week ago · 0 comments
#78 · Add sharding spec for conv 2D layer · sogartar · closed 1 week ago · 0 comments
#77 · Insert padding for conv and pooling ops before bit-width extend operations of operands · aviator19941 · closed 1 week ago · 0 comments
#76 · [punet] CI for quantization import/compilation/golden check · stellaraccident · opened 1 week ago · 2 comments
#75 · [punet] Evaluate rescale efficiency · stellaraccident · opened 1 week ago · 0 comments
#74 · [punet] Implement fp8 attention kernel · stellaraccident · opened 1 week ago · 0 comments
#73 · [sharktank] Integrate punet model into sdxl pipeline · stellaraccident · opened 1 week ago · 0 comments
#72 · Update VGPR info in the kernel optimization guide · kuhar · closed 1 week ago · 0 comments
#71 · Guard the testing facility used to verify last op dispatch. · stellaraccident · closed 1 week ago · 0 comments
#70 · Add test script and workflow for llama export, compile, serve. · ScottTodd · closed 1 week ago · 1 comment
#69 · Add docs and more logging for llama export, compile, run. · ScottTodd · closed 1 week ago · 0 comments
#68 · [punet] Add direct to linalg integer kernels for mmt, conv, pooling sum. · stellaraccident · closed 1 week ago · 0 comments
#67 · Add sharding specs · sogartar · closed 1 week ago · 0 comments
#66 · Add initial amdgpu kernel optimization guide · kuhar · closed 1 week ago · 1 comment
#65 · Disentangle the sharded and split notions · sogartar · closed 1 week ago · 0 comments
#64 · Fix quantizers_test on Windows. · ScottTodd · closed 1 week ago · 1 comment
#63 · Expand sharded element-wise binary ops support · sogartar · closed 1 week ago · 1 comment
#62 · Add a few cases of sharded matmul · sogartar · closed 2 weeks ago · 0 comments
#61 · [punet] Switch weight quantization to signed. · stellaraccident · closed 2 weeks ago · 0 comments
#60 · Make matmul's transpose_rhs default to False and use linear instead · sogartar · closed 2 weeks ago · 0 comments
#59 · Add sharded unreduced tensor type · sogartar · closed 2 weeks ago · 0 comments
#58 · Integer kernels: Matmul_per_axis_q8, attention_per_channel_q8, batch_matmul, convolution · KyleHerndon · opened 2 weeks ago · 1 comment
#57 · Sharded unreduced tensor · sogartar · closed 2 weeks ago · 1 comment
#56 · transpose_rhs argument in matmul · sogartar · closed 2 weeks ago · 5 comments
#55 · qconv testInputAsymPerChannel_WeightAsymPerChannel_NoBias fails · stellaraccident · opened 3 weeks ago · 0 comments
#54 · Add replicated tensor type and handle conv2d with sharded input channels · sogartar · closed 2 weeks ago · 4 comments
#53 · Add linear quantized conv layer. · stellaraccident · closed 3 weeks ago · 0 comments
#52 · [q_impls] Use int32 dtype for linear offset correction · aviator19941 · closed 3 weeks ago · 0 comments
#51 · [punet] Update quantizer to allow explicit mixed precision rescale. · stellaraccident · closed 3 weeks ago · 0 comments
#50 · Add sharding of conv2d, group norm and layer norm · sogartar · closed 3 weeks ago · 1 comment
#49 · Add some features that make it easier to do A/B comparison on punet. · stellaraccident · closed 3 weeks ago · 0 comments
#48 · Iterate on quantized linear and conv layers. · stellaraccident · closed 3 weeks ago · 0 comments
#47 · ROCm requirements fail install with python3.11 · sogartar · closed 1 week ago · 2 comments
#46 · Implement quantization import for punet model. · stellaraccident · closed 1 month ago · 1 comment
#45 · Disable mmtfp op as it is no longer needed and implicated in some issues. · stellaraccident · closed 1 month ago · 0 comments