milosacimovic opened 3 months ago
Thank you for pointing it out. You need to change the processor to rely on NumPy, and rewrite part of the conversion script to use ONNX instead of PyTorch. We will do it shortly, but any contribution from your side that could accelerate it is welcome.
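For the processor part, the torch-specific post-processing can usually be replaced one-for-one with NumPy. Here is a minimal sketch of what that swap looks like; the function names (`softmax`, `postprocess`) are illustrative, not the repo's actual processor API:

```python
import numpy as np

def softmax(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the per-row max for numerical stability before exponentiating,
    # mirroring what torch.softmax does internally.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def postprocess(logits: np.ndarray) -> np.ndarray:
    # Logits -> class probabilities -> predicted class ids, all in NumPy,
    # so the inference path has no torch dependency.
    return softmax(logits).argmax(axis=-1)

logits = np.array([[1.0, 3.0, 0.5],
                   [2.0, 0.1, 0.2]])
print(postprocess(logits))  # -> [1 0]
```

Since ONNX Runtime already returns NumPy arrays, a NumPy-only processor slots in without any conversion overhead.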
Do you know of any way to export the tokenizer to ONNX as well? Right now it seems to pull in torch too, through transformers: it's loaded with AutoTokenizer from transformers, which relies on torch.
Is it possible to export to ONNX and run inference without depending on PyTorch?
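On the tokenizer side, one commonly suggested workaround (an assumption here, not something this repo ships) is to skip transformers entirely at inference time and load a saved `tokenizer.json` with the standalone `tokenizers` package, which is Rust-backed and does not depend on torch. A minimal sketch, assuming the tokenizer has already been saved to a `tokenizer.json` file (hypothetical path):

```python
def encode_without_torch(tokenizer_path: str, text: str):
    """Encode text using the standalone `tokenizers` package.

    Assumes `tokenizers` is installed and `tokenizer_path` points to a
    tokenizer.json previously saved via tokenizer.save(...). Importing
    `tokenizers` does not pull in torch, unlike a full transformers stack.
    """
    from tokenizers import Tokenizer
    tok = Tokenizer.from_file(tokenizer_path)
    enc = tok.encode(text)
    return enc.ids, enc.attention_mask

if __name__ == "__main__":
    import os
    path = "tokenizer.json"  # hypothetical path to a saved tokenizer
    if os.path.exists(path):
        ids, mask = encode_without_torch(path, "hello world")
        print(ids)
```

The resulting input ids can be fed straight into an `onnxruntime.InferenceSession` as NumPy arrays, so the whole pipeline (tokenize, run ONNX model, post-process) can avoid PyTorch.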