segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
MIT License

model in huggingface cannot load mixtures.skops #110

Closed by syeelou 10 months ago

syeelou commented 10 months ago

When I load the mini model downloaded from Hugging Face (https://huggingface.co/benjamin/wtp-bert-mini), the file mixtures.skops is not loaded, so the style parameter cannot be used during segmentation. Reading the code, I traced the problem to https://github.com/bminixhofer/wtpsplit/blob/main/wtpsplit/__init__.py#L48

The file names are inconsistent: the code looks for mixture.skops, but the downloaded file is named mixtures.skops. Renaming the downloaded file from mixtures.skops to mixture.skops makes it load successfully. Could you either change the code or rename the file on Hugging Face?
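For anyone who wants to script the rename workaround on a local copy of the model, here is a minimal sketch. It assumes the model files sit in a single local folder; the helper name ensure_mixture_file and the folder path in the usage note are made up for illustration and are not part of wtpsplit. It copies rather than renames, so the original download stays untouched.

```python
import shutil
from pathlib import Path


def ensure_mixture_file(model_dir: str) -> bool:
    """Copy mixtures.skops to mixture.skops if only the former exists.

    Hypothetical helper, not part of wtpsplit.
    Returns True if a copy was made, False if nothing had to be done.
    """
    folder = Path(model_dir)
    downloaded = folder / "mixtures.skops"  # file name as downloaded from Hugging Face
    expected = folder / "mixture.skops"     # file name the loader looked for (per this issue)

    # Nothing to do if the expected file is already there,
    # or if there is no downloaded file to copy from.
    if expected.exists() or not downloaded.exists():
        return False

    shutil.copyfile(downloaded, expected)
    return True
```

Calling ensure_mixture_file("path/to/wtp-bert-mini") once before loading the model from that folder should be enough; the path here is a placeholder for wherever the files were downloaded.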

bminixhofer commented 10 months ago

Hi, thanks, good catch! I guess this also happened in #108. It's fixed in version 1.2.4 via https://github.com/bminixhofer/wtpsplit/commit/41821e00dd7d2a990b82426ef90c52609edc818a.
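To check whether an environment already has the fixed release, something like the following sketch may help. It assumes the package is installed under the distribution name wtpsplit and that a simple numeric comparison of the first three version components is good enough; the helper names are illustrative only, and pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (1, 2, 4)  # release that contains the fix, per the comment above


def has_fix(version_string: str) -> bool:
    """Return True if version_string is at least 1.2.4 (numeric parts only)."""
    numbers = [int(n) for n in re.findall(r"\d+", version_string)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad short versions such as "2.0"
    return tuple(numbers) >= FIXED_IN


def installed_wtpsplit_has_fix() -> bool:
    """Check the installed wtpsplit, returning False if it is not installed."""
    try:
        return has_fix(version("wtpsplit"))
    except PackageNotFoundError:
        return False
```

If installed_wtpsplit_has_fix() returns False, upgrading wtpsplit to 1.2.4 or later should pick up the fix.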