Closed chisingh closed 3 months ago
The total_cost has changed because DLN relies on the number of tokens. text-davinci-003 and gpt-3.5-turbo-instruct use different tokenizers/encoders.
In [1]: import tiktoken
In [2]: gpt_3 = tiktoken.encoding_for_model("text-davinci-003")
In [3]: gpt_3_5 = tiktoken.encoding_for_model("gpt-3.5-turbo-instruct")
In [4]: gpt_3.encode("What is the second-largest city in Canada?")
Out[4]: [2061, 318, 262, 1218, 12, 28209, 1748, 287, 3340, 30]
In [5]: gpt_3_5.encode("What is the second-largest city in Canada?")
Out[5]: [3923, 374, 279, 2132, 68067, 3363, 304, 7008, 30]
In [6]: gpt_3.name
Out[6]: 'p50k_base'
In [7]: gpt_3_5.name
Out[7]: 'cl100k_base'
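To make the effect on a token-based cost concrete, here is a minimal sketch that reuses the token ids from the session above. The `token_cost` helper is hypothetical and assumes the cost is simply one unit per token; it is not DLN's actual `total_cost` computation.

```python
# Token ids copied verbatim from the IPython session above.
p50k_tokens = [2061, 318, 262, 1218, 12, 28209, 1748, 287, 3340, 30]  # text-davinci-003
cl100k_tokens = [3923, 374, 279, 2132, 68067, 3363, 304, 7008, 30]    # gpt-3.5-turbo-instruct


def token_cost(token_ids):
    """Hypothetical cost model: one unit per token (an assumption, not DLN's code)."""
    return len(token_ids)


cost_davinci = token_cost(p50k_tokens)
cost_instruct = token_cost(cl100k_tokens)

# The same prompt yields a different token count under each encoding,
# so any cost derived from token counts changes with the model.
print(cost_davinci, cost_instruct)
```

Even for this single short prompt the counts differ by one token, so a cost accumulated over a full run would be expected to drift between the two models.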
Overall looks good. I left a few comments/questions.
Thank you.
Looks great. I'll let @matheper merge it.
Can you check how to instantiate multiple models within the same execution, with each model receiving its connection details when you instantiate/register it?
They used to be provided in the kwargs of completion/chat_completion, but now they go to the OpenAI object instantiation.
For example:
phi_2 = llm_registry.register(
    "microsoft/phi-2",
    "microsoft/phi-2",
    api_base="http://something_here",
    api_key="key",
)
gpt = GPT(
    "gpt-4",
    api_base="http://something_else",
    api_key="another_key",
)
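One possible way to keep per-model connection details is to build one client per registered model and store it alongside the model name. The sketch below is only an illustration under assumptions: `ModelRegistry` and `StubClient` are hypothetical names, not DLN's `llm_registry`, and `StubClient` stands in for the `OpenAI` client (which in openai v1 accepts `api_key` and `base_url` at construction) so the example runs without the package installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StubClient:
    """Hypothetical stand-in for openai.OpenAI; it only records the connection details."""
    api_key: str
    base_url: str


class ModelRegistry:
    """Hypothetical registry: one client per model, created at registration time."""

    def __init__(self, client_factory=StubClient):
        # client_factory could be openai.OpenAI in real code (assumption).
        self._client_factory = client_factory
        self._clients = {}

    def register(self, name, api_base, api_key):
        if name in self._clients:
            raise ValueError(f"model {name!r} is already registered")
        # Connection details go to the client constructor, not to each completion call.
        client = self._client_factory(api_key=api_key, base_url=api_base)
        self._clients[name] = client
        return client

    def get(self, name):
        return self._clients[name]


registry = ModelRegistry()
phi_2 = registry.register("microsoft/phi-2", api_base="http://something_here", api_key="key")
gpt = registry.register("gpt-4", api_base="http://something_else", api_key="another_key")
```

Because each model owns its client, two models with different endpoints and keys can coexist in the same execution without relying on global configuration.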
Upgrade to openai v1 python package