hi @tmbcgcg - we should fall back to the 3.5 turbo tokenizer here, I think (see the sketch below).
I'll add that in and get it released ASAP - thanks for the issue!
EDIT
Actually, hold on - something is weird. It seems you've somehow swapped the API version in for the model? Can you share the non-sensitive parts of your Azure settings?
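For reference, the fallback mentioned above would look roughly like this - a minimal sketch using tiktoken's public API (`get_encoder` is an illustrative name here, not Marvin's actual helper; `cl100k_base` is the encoding `gpt-3.5-turbo` maps to):

```python
import tiktoken

def get_encoder(model: str):
    """Return an encode function for `model`, falling back to the
    gpt-3.5-turbo encoding when tiktoken cannot map the name
    (e.g. Azure deployment names like 'gpt-35-turbo')."""
    try:
        return tiktoken.encoding_for_model(model).encode
    except KeyError:
        # cl100k_base is the encoding used by gpt-3.5-turbo
        return tiktoken.get_encoding("cl100k_base").encode
```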
Hi @zzstoatzz, I am actually having the same issue, and nothing is weird in my Azure settings. I tried the basic sample code as follows:
import marvin
result = marvin.classify("yes", labels=bool)
and I obtained the following error:
KeyError Traceback (most recent call last)
Cell In[3], line 1
----> 1 import marvin
2 result = marvin.classify("yes", labels=bool)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\marvin\__init__.py:3
1 from .settings import settings
----> 3 from .ai.text import (
4 fn,
5 cast,
6 cast_async,
7 extract,
8 extract_async,
9 classify,
10 classify_async,
11 classifier,
12 generate,
13 generate_async,
14 model,
15 Model,
16 )
17 from .ai.images import paint, image
18 from .ai.audio import speak_async, speak, speech, transcribe, transcribe_async
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\marvin\ai\__init__.py:3
1 from . import images
2 from . import audio
----> 3 from . import text
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\marvin\ai\text.py:26
24 import marvin
25 import marvin.utilities.tools
---> 26 from marvin._mappings.types import (
27 cast_labels_to_grammar,
28 cast_type_to_labels,
29 )
30 from marvin.ai.prompts.text_prompts import (
31 CAST_PROMPT,
32 CLASSIFY_PROMPT,
(...)
35 GENERATE_PROMPT,
36 )
37 from marvin.client.openai import AsyncMarvinClient, ChatCompletion, MarvinClient
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\marvin\_mappings\types.py:121
108 encoder = settings.openai.chat.completions.encoder
109 return Grammar(
110 max_tokens=max_tokens,
111 logit_bias={
(...)
115 },
116 )
119 def cast_type_to_grammar(
120 type_: Union[type, GenericAlias],
--> 121 encoder: Callable[[str], list[int]] = settings.openai.chat.completions.encoder,
122 max_tokens: Optional[int] = None,
123 enumerate_: bool = True,
124 **kwargs: Any,
125 ) -> Grammar:
126 return cast_labels_to_grammar(
127 labels=cast_type_to_labels(type_),
128 encoder=encoder,
(...)
131 **kwargs,
132 )
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\marvin\settings.py:47, in ChatCompletionSettings.encoder(self)
43 @property
44 def encoder(self):
45 import tiktoken
---> 47 return tiktoken.encoding_for_model(self.model).encode
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tiktoken\model.py:101, in encoding_for_model(model_name)
96 def encoding_for_model(model_name: str) -> Encoding:
97 """Returns the encoding used by a model.
98
99 Raises a KeyError if the model name is not recognised.
100 """
--> 101 return get_encoding(encoding_name_for_model(model_name))
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\tiktoken\model.py:88, in encoding_name_for_model(model_name)
85 return model_encoding_name
87 if encoding_name is None:
---> 88 raise KeyError(
89 f"Could not automatically map {model_name} to a tokeniser. "
90 "Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect."
91 ) from None
93 return encoding_name
KeyError: "Could not automatically map 'gpt-35-turbo' to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect."
Do you have any hint on what went wrong, please? I am new to Marvin, to be frank.
this should be solved in v2.1.6 via https://github.com/PrefectHQ/marvin/pull/846
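After upgrading to v2.1.6 or later, the original snippet should import and run without the KeyError (assuming the Azure settings are otherwise correct):

```python
import marvin

# with the tokenizer fallback in place, this no longer raises at import time
result = marvin.classify("yes", labels=bool)
print(result)  # expected: True
```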
First check
Bug summary
Getting 'Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.' All Azure settings are in .env; the only one I am not sure about is the deployment name, which I am going to confirm.
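For illustration only, the .env looks along these lines - the variable names below follow the openai SDK's Azure conventions and are an assumption here, not confirmed Marvin setting names:

```
# illustrative only: names follow the openai SDK's Azure conventions;
# Marvin's actual setting names may differ - check its docs
AZURE_OPENAI_API_KEY=<your-key>
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
OPENAI_API_VERSION=<your-api-version>
# the deployment name is whatever you chose in the Azure portal,
# e.g. 'gpt-35-turbo' (note: Azure names drop the dot in 'gpt-3.5-turbo')
```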
Reproduction
Error
Versions
Additional context
`marvin version` returns the same error :) (which matches the traceback above: the encoder default is evaluated when `marvin` is imported, so even the CLI hits the KeyError)