siddharth-sharma7 / fast-Bart
Convert BART models to ONNX with quantization: 3X reduction in size, and up to 3X boost in inference speed.
34 stars, 3 forks
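The repository's headline claim rests on dynamic int8 quantization of the exported ONNX weights. As a minimal, self-contained illustration of the arithmetic behind the roughly 3X size reduction (float32 weights become int8 plus one float scale): this is a sketch only, and the helper names `quantize_int8`/`dequantize` are hypothetical, not taken from this repo or from ONNX Runtime.

```python
# Illustrative sketch of symmetric int8 weight quantization -- the arithmetic
# underlying dynamic quantization's weight-size reduction (float32 -> int8).
# Not code from the fast-Bart repo; helper names are made up for this example.

def quantize_int8(weights):
    """Map float weights to int8 values plus a single scale factor."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0                      # symmetric int8 range [-127, 127]
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, recovered))
```

Each float32 weight (4 bytes) is stored as one int8 (1 byte), so a weight-only dynamic quantization cuts model size by close to 4X before format overhead; activations are quantized on the fly at inference time.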
Issues (newest first)
#6 Update requirements.txt — opened by adilrahman 1 year ago, 0 comments
#5 TypeError: quantize_dynamic() got an unexpected keyword argument 'activation_type' — opened by arnabmanna619 2 years ago, 3 comments
#4 AttributeError: 'BartEncoder' object has no attribute 'main_input_name' — opened by zeke-john 2 years ago, 1 comment
#3 Inference slower than PyTorch model for long sequence length — opened by jasontian6666 2 years ago, 1 comment
#2 Unable to use the CUDA provider for inferencing — opened by girishnadiger-gep 2 years ago, 3 comments
#1 Deploy ONNX model to TensorRT — opened by will-wiki 2 years ago, 1 comment