microsoft / deep-language-networks

We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a Deep Language Network (DLN).
MIT License
87 stars, 12 forks

Upgrade openai package #51

Closed: chisingh closed this 3 months ago

chisingh commented 3 months ago

Upgrade to the openai v1 Python package.

matheper commented 3 months ago

total_cost has changed because DLN relies on the number of tokens, and text-davinci-003 and gpt-3.5-turbo-instruct use different tokenizers/encoders (p50k_base and cl100k_base, respectively), so the same text produces a different number of tokens:

In [1]: import tiktoken

In [2]: gpt_3 = tiktoken.encoding_for_model("text-davinci-003")

In [3]: gpt_3_5 = tiktoken.encoding_for_model("gpt-3.5-turbo-instruct")

In [4]: gpt_3.encode("What is the second-largest city in Canada?")
Out[4]: [2061, 318, 262, 1218, 12, 28209, 1748, 287, 3340, 30]

In [5]: gpt_3_5.encode("What is the second-largest city in Canada?")
Out[5]: [3923, 374, 279, 2132, 68067, 3363, 304, 7008, 30]

In [6]: gpt_3.name
Out[6]: 'p50k_base'

In [7]: gpt_3_5.name
Out[7]: 'cl100k_base'
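
To make the effect on a token-based cost concrete, here is a minimal sketch that only counts the token ids printed in the session above. The helper name `token_usage` is hypothetical and is not DLN's actual cost function; it just assumes the cost scales with the number of tokens.

```python
# Token ids copied from the session above (same prompt, two encodings).
p50k_tokens = [2061, 318, 262, 1218, 12, 28209, 1748, 287, 3340, 30]  # text-davinci-003
cl100k_tokens = [3923, 374, 279, 2132, 68067, 3363, 304, 7008, 30]    # gpt-3.5-turbo-instruct


def token_usage(encoded_prompts):
    """Hypothetical helper: total number of tokens over already-encoded prompts."""
    return sum(len(tokens) for tokens in encoded_prompts)


old_usage = token_usage([p50k_tokens])
new_usage = token_usage([cl100k_tokens])

# Relative change in token count when switching encodings for this one prompt.
relative_change = (new_usage - old_usage) / old_usage

print(f"p50k_base: {old_usage} tokens, cl100k_base: {new_usage} tokens")
print(f"relative change: {relative_change:+.0%}")
```

For this single prompt the count drops from 10 to 9 tokens, so any metric derived from token counts shifts even though the text is identical.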

chisingh commented 3 months ago

Overall looks good. I left a few comments/questions.

Thank you.

MarcCote commented 3 months ago

Looks great. I'll let @matheper merge it.

matheper commented 3 months ago

Can you check how to instantiate multiple models within the same execution, with each one receiving its own connection details when you instantiate/register it?

They used to be passed in the kwargs of completion/chat_completion, but in openai v1 they are passed when the OpenAI client object is instantiated.

For example:

phi_2 = llm_registry.register(
    "microsoft/phi-2",
    "microsoft/phi-2",
    api_base="http://something_here",
    api_key="key",
)
gpt = GPT(
    "gpt-4",
    api_base="http://something_else",
    api_key="another_key",
)
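
One way this could work, as a rough sketch rather than DLN's actual implementation: give every registered model its own client, created once at registration time. The names `ModelRegistry`, `ModelHandle`, `client_factory`, and `openai_client_factory` are hypothetical. The only openai v1 facts assumed are that `OpenAI(...)` accepts `api_key` and `base_url` (so an `api_base` argument has to be mapped to `base_url`) and that completions are requested with `client.completions.create(model=..., prompt=...)`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


def openai_client_factory(api_key: str, base_url: str) -> Any:
    """Build an openai v1 client; imported lazily so the registry works without it."""
    from openai import OpenAI  # requires openai>=1.0

    return OpenAI(api_key=api_key, base_url=base_url)


@dataclass
class ModelHandle:
    """Hypothetical wrapper binding one model name to its own client."""

    model_name: str
    client: Any

    def complete(self, prompt: str, **kwargs: Any) -> Any:
        # Connection details live on the client, not in the call kwargs.
        return self.client.completions.create(
            model=self.model_name, prompt=prompt, **kwargs
        )


class ModelRegistry:
    """Hypothetical registry: each registered model gets a separate client."""

    def __init__(self, client_factory: Callable[..., Any] = openai_client_factory):
        self._client_factory = client_factory
        self._models: Dict[str, ModelHandle] = {}

    def register(self, name: str, model_name: str, *, api_base: str, api_key: str) -> ModelHandle:
        if name in self._models:
            raise ValueError(f"model {name!r} is already registered")
        # openai v1 calls the endpoint argument base_url, not api_base.
        client = self._client_factory(api_key=api_key, base_url=api_base)
        handle = ModelHandle(model_name=model_name, client=client)
        self._models[name] = handle
        return handle

    def get(self, name: str) -> ModelHandle:
        return self._models[name]
```

With this shape, two models pointing at different endpoints can coexist in one process because no connection setting is global; passing a different `client_factory` also makes it possible to exercise the registry without network access.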