basetenlabs / truss-examples

Examples of models deployable with Truss
https://trussml.com
MIT License

Update transformers on mistral to fix mistral examples #320

Closed squidarth closed 4 months ago

squidarth commented 4 months ago

Fix the Mistral truss-examples; see the issue for context. Something changed with the tokenizers library, so these examples need to be updated.

This is the exception that we're seeing:

Exception while loading model

Traceback (most recent call last):
  File "/app/model_wrapper.py", line 118, in load
    self.try_load()
  File "/app/model_wrapper.py", line 179, in try_load
    retry(
  File "/app/common/retry.py", line 20, in retry
    raise exc
  File "/app/common/retry.py", line 15, in retry
    fn()
  File "/app/model/model.py", line 34, in load
    self.tokenizer = AutoTokenizer.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 751, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2045, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 122, in __init__
    super().__init__(
  File "/usr/local/lib/python3.11/dist-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 40 column 3
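
For anyone debugging the same failure, here is a minimal diagnostic sketch. It is not part of this PR. The enum name in the error (`PyPreTokenizerTypeWrapper`) suggests that the installed tokenizers build could not parse the `pre_tokenizer` section of the model's `tokenizer.json`, so the sketch prints the installed `transformers` and `tokenizers` versions and lists the pre-tokenizer types a `tokenizer.json` declares. The helper names `installed_version` and `pre_tokenizer_types` are hypothetical, and the `pretokenizers` key for nested entries is an assumption about the file layout.

```python
import json
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pre_tokenizer_types(tokenizer_json_text):
    """List the `type` names declared in the pre_tokenizer section.

    Assumes nested pre-tokenizers are stored under a `pretokenizers` key.
    """
    config = json.loads(tokenizer_json_text)
    section = config.get("pre_tokenizer")
    if section is None:
        return []
    types = [section.get("type")]
    for child in section.get("pretokenizers", []):
        types.append(child.get("type"))
    return types


if __name__ == "__main__":
    # Versions to compare against the pins in the example's requirements.
    for package in ("transformers", "tokenizers"):
        print(package, installed_version(package))
```

To use it, read the model's `tokenizer.json` as text and pass it to `pre_tokenizer_types`. If the file declares a pre-tokenizer type that the installed tokenizers version does not recognize, that would be consistent with the exception above.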