segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
MIT License

add ONNX support for SaT models #129

Closed: markus583 closed this 2 months ago

markus583 commented 2 months ago

This adds ONNX support for sat, sat-sm, and sat-lora models, and includes documentation and testing.

TODOs:

markus583 commented 2 months ago

I had to remove the exact timings: ORT on GPU just doesn't work on any of my systems, even after extensive trial and error. If you're fine with that, @bminixhofer, we can merge!

bminixhofer commented 2 months ago

The input names in the ONNX model were swapped (the attention_mask input was named input_ids and vice versa). It still worked because the arguments in the call in extract.py were swapped as well, so the two swaps cancelled each other out. I have removed both swaps now (one of them by changing the argument order in the export).
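For anyone hitting the same thing: a name swap like this is easy to miss because an ONNX export assigns the given input names to the graph inputs purely by position. Below is a minimal sketch of a guard that could run before the export. `check_input_names` and the stand-in `forward` are hypothetical names for illustration and are not part of wtpsplit; the real forward signature may differ.

```python
import inspect


def check_input_names(forward, input_names):
    """Raise ValueError unless input_names matches forward's leading positional parameters."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    # Collect the positional parameter names in declaration order.
    params = [
        name
        for name, param in inspect.signature(forward).parameters.items()
        if param.kind in positional_kinds
    ]
    expected = params[: len(input_names)]
    if list(input_names) != expected:
        raise ValueError(
            f"input_names {list(input_names)} do not match forward parameters {expected}"
        )


# Hypothetical stand-in for the exported model's forward method (assumed signature).
def forward(input_ids, attention_mask):
    return input_ids, attention_mask


# Passes silently: the names follow the positional order of forward.
check_input_names(forward, ["input_ids", "attention_mask"])
```

Calling it with the two names reversed raises `ValueError`, which would have surfaced the swap at export time instead of relying on a compensating swap at the call site.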

I also added timings for onnxruntime on GPU; it is indeed ~50% faster!

LGTM now. @markus583, maybe take a final look, then we can merge and release.