siddharth-sharma7 / fast-Bart
Convert BART models to ONNX with quantization: 3X reduction in size, and up to 3X boost in inference speed.
34 stars, 3 forks
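The repository's headline claim rests on dynamic int8 quantization of the exported ONNX weights. As a minimal, self-contained illustration of the arithmetic behind the roughly 3X size reduction (float32 weights become int8 plus one float scale): this is a sketch only, and the helper names `quantize_int8`/`dequantize` are hypothetical, not taken from this repo or from ONNX Runtime.

```python
# Illustrative sketch of symmetric int8 weight quantization -- the arithmetic
# underlying dynamic quantization's weight-size reduction (float32 -> int8).
# Not code from the fast-Bart repo; helper names are made up for this example.

def quantize_int8(weights):
    """Map float weights to int8 values plus a single scale factor."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0                      # symmetric int8 range [-127, 127]
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, recovered))
```

Each float32 weight (4 bytes) is stored as one int8 (1 byte), so a weight-only dynamic quantization cuts model size by close to 4X before format overhead; activations are quantized on the fly at inference time.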
Issues (newest first)
#6 Update requirements.txt — opened by adilrahman 1 year ago, 0 comments
#5 TypeError: quantize_dynamic() got an unexpected keyword argument 'activation_type' — opened by arnabmanna619 2 years ago, 3 comments
#4 AttributeError: 'BartEncoder' object has no attribute 'main_input_name' — opened by zeke-john 2 years ago, 1 comment
#3 Inference slower than PyTorch model for long sequence length — opened by jasontian6666 2 years ago, 1 comment
#2 Unable to use the CUDA provider for inferencing — opened by girishnadiger-gep 2 years ago, 3 comments
#1 Deploy ONNX model to TensorRT — opened by will-wiki 2 years ago, 1 comment