openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

Use a custom exception ValueError subclass for the special tokens warning #290

Open simonw opened 5 months ago

simonw commented 5 months ago

This code here: https://github.com/openai/tiktoken/blob/39f29cecdb6fc38d9a3434e5dd15e4de58cf3c80/tiktoken/core.py#L375-L383

I wanted to do something special on this exception in my own code, so I had to write this:

try:
    tokens = encoding.encode(text, **kwargs)
except ValueError as ex:
    if 'disallowed special token' in str(ex):
        # Do something special
        ...
    else:
        # Any other ValueError is unrelated, so let it propagate
        raise

I suggest having a custom exception class for this instead:

class DisallowedSpecialTokenError(ValueError):
    pass

Raising that class instead would let people like me catch it explicitly, and since it's a subclass of ValueError it should not break any existing code that currently catches ValueError directly.
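
To illustrate the compatibility claim, here is a rough sketch. It does not use tiktoken itself: `encode_stub` is a hypothetical stand-in for `encoding.encode()` that only mimics the special-token check, and `DisallowedSpecialTokenError` is the class proposed above, not something tiktoken exports. The helper names `encode_or_none` and `legacy_encode_or_none` are likewise made up for the example.

```python
class DisallowedSpecialTokenError(ValueError):
    """Hypothetical subclass proposed in this issue; not part of tiktoken."""


def encode_stub(text, disallowed=("<|endoftext|>",)):
    """Stand-in for encoding.encode(); only mimics the special-token check."""
    for token in disallowed:
        if token in text:
            raise DisallowedSpecialTokenError(
                f"Encountered text corresponding to disallowed special token {token!r}."
            )
    # Placeholder "tokens" (UTF-8 bytes), not real BPE output
    return list(text.encode("utf-8"))


def encode_or_none(text):
    """New-style caller: catch the specific error, let other ValueErrors propagate."""
    try:
        return encode_stub(text)
    except DisallowedSpecialTokenError:
        return None


def legacy_encode_or_none(text):
    """Existing caller: catches ValueError and matches on the message, unchanged."""
    try:
        return encode_stub(text)
    except ValueError as ex:
        if "disallowed special token" in str(ex):
            return None
        raise
```

Both callers handle the same error: the new one no longer depends on the message text, and the old string-matching one keeps working because the subclass is still a ValueError.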