marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using GGML library.
MIT License

Model crashes when tokenizing 4 digit numbers #22

Closed by underwaterbepis 1 year ago

underwaterbepis commented 1 year ago

model.tokenize("1111")

crashes the Python interpreter.

If relevant, I'm using https://huggingface.co/TheBloke/Nous-Hermes-13B-GGML
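Because the crash takes down the whole interpreter, it can help to run the reproduction in a child process so the parent session survives and can report the exit code. The following is a minimal sketch, not part of the original report: `run_isolated` and `REPRO` are hypothetical names, the model path is a placeholder for a locally downloaded GGML file, and `model_type="llama"` is an assumption about the model linked above.

```python
import subprocess
import sys

# Hypothetical reproduction script, passed to a fresh interpreter as a string.
# The model path is a placeholder and must point at a local GGML file;
# model_type="llama" is an assumption about the model in the report.
REPRO = '''
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("/path/to/ggml-model.bin", model_type="llama")
print(llm.tokenize("1111"))
'''


def run_isolated(code: str, timeout: float = 600.0) -> int:
    """Run `code` in a separate Python process and return its exit code.

    A return value of 0 means the snippet finished normally. A nonzero value
    means it failed; on POSIX systems a negative value means the child was
    killed by a signal (for example a segmentation fault in native code).
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode
```

Calling `run_isolated(REPRO)` on an affected version would be expected to return a nonzero exit code instead of killing the calling session. Note that an ordinary Python exception, such as a missing package or model file, also produces a nonzero code, so only a negative value on POSIX points to a native crash.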

marella commented 1 year ago

I pushed a fix. Will release it in the next version. If you would like to try it before the next release, you can install from source:

pip install git+https://github.com/marella/ctransformers

marella commented 1 year ago

This is fixed in the latest version, 0.2.9.
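To check whether an environment already has a release that includes the fix, the installed version can be compared against 0.2.9. This is a rough sketch using only the standard library; `parse_version`, `has_fix`, and `installed_has_fix` are hypothetical helper names, and the comparison looks only at the first three numeric components, so a pre-release such as `0.2.9rc1` is treated the same as `0.2.9`.

```python
import re
from importlib import metadata

# First release reported in this thread as containing the fix.
FIXED_IN = (0, 2, 9)


def parse_version(text: str) -> tuple:
    """Turn a string like '0.2.10' into a tuple of three integers.

    Only the leading digits of each component are used, and missing
    components are filled with 0. This is a simplification, not a full
    version parser.
    """
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(installed: str) -> bool:
    """Return True if `installed` is at least the fixed version."""
    return parse_version(installed) >= FIXED_IN


def installed_has_fix(package: str = "ctransformers"):
    """Check the installed package; return None if it is not installed."""
    try:
        return has_fix(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

If `installed_has_fix()` returns `False`, upgrading with `pip install --upgrade ctransformers` should pull in a release that contains the fix.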