siddharth-sharma7 / fast-Bart

Convert BART models to ONNX with quantization: a 3x reduction in model size and up to a 3x boost in inference speed.
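To give a sense of where a size reduction of this kind can come from, here is a minimal, standard-library-only sketch of symmetric int8 weight quantization. It is an illustration under stated assumptions, not this repository's code and not ONNX Runtime's implementation: the helper names `quantize_int8` and `dequantize` are hypothetical, and the weights are random values rather than real BART parameters. Storing each weight as one int8 byte instead of a 4-byte float32 gives a 4x saving on the quantized tensor itself. An end-to-end figure can be lower, plausibly because not every tensor in an exported model is quantized.

```python
import random
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale.

    Hypothetical helper for illustration only.
    """
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate float32 weights."""
    return array("f", (v * scale for v in q))


# Stand-in for one weight tensor (random values, not real model weights).
random.seed(0)
weights = array("f", (random.gauss(0.0, 0.02) for _ in range(10_000)))

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

fp32_bytes = weights.itemsize * len(weights)      # 4 bytes per weight
int8_bytes = q_weights.itemsize * len(q_weights)  # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} B, int8: {int8_bytes} B, "
      f"ratio: {fp32_bytes / int8_bytes:.1f}x")
print(f"max round-trip error: {max_error:.6f} (scale: {scale:.6f})")
```

The round-trip error is bounded by about half of `scale`, which is the accuracy cost paid for the smaller representation. Whether that cost is acceptable for a given BART checkpoint has to be measured on the actual model.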