clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.75k stars · 466 forks

Converting trained Donut model to a single onnx file #230

Open TanvirHundredOne opened 1 year ago

TanvirHundredOne commented 1 year ago

I'm working on a production solution where I want to serve my Donut model from an NVIDIA Triton Inference Server. But I'm struggling to convert my Donut model and its associated files (tokenizer, processor, etc.) into a single ONNX file, which is the preferred layout for Triton.
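For context (not from this thread), a Triton model-repository entry for a single ONNX file is usually laid out like the sketch below. Every name and dimension here is an illustrative placeholder, not something confirmed for a Donut export; the real tensor names and shapes must be read off the exported model (e.g. with netron) and copied in:

```
# Illustrative Triton layout for a single-file ONNX model:
#
#   model_repository/
#   `-- donut/
#       |-- config.pbtxt
#       `-- 1/
#           `-- model.onnx
#
# config.pbtxt (all tensor names/dims below are hypothetical):
name: "donut"
platform: "onnxruntime_onnx"
max_batch_size: 1
input [
  {
    name: "pixel_values"     # hypothetical input name
    data_type: TYPE_FP32
    dims: [ 3, 1280, 960 ]   # hypothetical; match your preprocessor config
  }
]
output [
  {
    name: "logits"           # hypothetical output name
    data_type: TYPE_FP32
    dims: [ -1, -1 ]
  }
]
```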

I've had some limited success, where I ended up with ONNX files plus some other metadata files. Can anyone please help me package it all into a single file?

I have tried to create the ONNX file using the optimum library, but it produces this file structure:

    |-- added_tokens.json
    |-- config.json
    |-- decoder_model.onnx
    |-- decoder_model.onnx_data
    |-- decoder_with_past_model.onnx
    |-- decoder_with_past_model.onnx_data
    |-- encoder_model.onnx
    |-- generation_config.json
    |-- preprocessor_config.json
    |-- sentencepiece.bpe.model
    |-- special_tokens_map.json
    |-- tokenizer_config.json
    `-- tokenizer.json

Ideally, there would be a single donut_model.onnx file instead.

Thanks in advance.

MatinHz commented 1 year ago

You can use the --monolith flag of the optimum-cli export onnx command to force it to export a single ONNX file, but this is not recommended for encoder-decoder models like Donut:

  --monolith            Forces to export the model as a single ONNX file. By default, the ONNX exporter may break the
                        model in several ONNX files, for example for encoder-decoder models where the encoder should be
                        run only once while the decoder is looped over.

abdullaha1rafi commented 1 year ago

How do I run inference using the single donut_model.onnx? Hugging Face's solution expects the decoupled .onnx files.