janhq / models

Models support in Jan and Cortex
MIT License

model: llama3.1 #16

Closed by dan-homebrew 3 weeks ago

dan-homebrew commented 2 months ago

Goal

nguyenhoangthuan99 commented 2 months ago

This ticket also resolves janhq/cortex.cpp#295.

nguyenhoangthuan99 commented 1 month ago

Here is a Python script that runs function calling. GitHub blocks .py file attachments, so I cannot upload it and am pasting it inline instead.

import requests, json

ENDPOINT = "https://litellm.jan.ai/v1/chat/completions" # "http://localhost:3928/v1/chat/completions" #
MODEL = "alan-gift" # "meta-llama3.1-8b-instruct" #

# Raw string so GBNF escapes such as \" and \x7F reach the server verbatim
grammar = r"""
root   ::= object
value  ::= object | array | string | number | ("true" | "false" | "null") ws

object ::=
  "{" ws (
            string ":" ws value
    ("," ws string ":" ws value)*
  )? "}" ws

array  ::=
  "[" ws (
            value
    ("," ws value)*
  )? "]" ws

string ::=
  "\"" (
    [^"\\\x7F\x00-\x1F] |
    "\\" (["\\bfnrt] | "u" [0-9a-fA-F]{4}) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [0-9] [1-9]{0,15})? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= | " " | "\n" [ \t]{0,20}
"""

system_prompt = """
Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 20 September 2024

# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search

You have access to the following CUSTOM functions:

Use the function 'spotify_trending_songs' to: Get top trending songs on Spotify
{
  "name": "spotify_trending_songs",
  "description": "Get top trending songs on Spotify",
  "parameters": {
    "n": {
      "param_type": "int",
      "description": "Number of trending songs to get",
      "required": true
    }
  }
}

Use the function 'get_current_conditions' to: Get the current weather conditions for a specific location
{
    "type": "function",
    "function": {
    "name": "get_current_conditions",
    "description": "Get the current weather conditions for a specific location",
    "parameters": {
        "type": "object",
        "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g., San Francisco, CA"
        },
        "unit": {
            "type": "string",
            "enum": ["Celsius", "Fahrenheit"],
            "description": "The temperature unit to use. Infer this from the user's location."
        }
        },
        "required": ["location", "unit"]
    }
    }
}

If you choose to call a CUSTOM function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where

start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`

Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>

Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query
- If you cannot find correct parameters for a function, ask the user to provide them.
- No explanation is needed when calling a function.

You are a helpful assistant.
"""
user_prompt = "Who is US president in 2024"
system = {"role":"system","content":system_prompt}
user = {"role":"user","content":user_prompt}

messages = [system,user]
body = {
    "model": MODEL,
    "messages": messages,
    "top_p":0.9,
    "top_k":40,
    "temperature":0.6,
    "stop" : ["</s>","<|eot_id|>"],
    "grammar":grammar,
}

result = requests.post(ENDPOINT, json=body, headers={"content-type": "application/json"}).json()
print(json.dumps(result, indent=4))
assistant = result["choices"][0]["message"]
user2 = {"role": "user", "content": "Maybe CA"}
# ipython = {"role": "ipython", ...}  # tool results would be fed back with the "ipython" role
messages = [system, user, assistant, user2]

body = {
    "model": MODEL,
    "messages": messages,
    "top_p":0.9,
    "temperature":0.6,
    "stop" : ["</s>","<|eot_id|>","<|eom_id|>"],
    "grammar":grammar,
}
result = requests.post(ENDPOINT, json=body,headers={'content-type': 'application/json'}).json()
print(json.dumps(result,indent=4))
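The script above only prints the raw responses. A small helper (hypothetical, not part of this issue) can parse the `<function=name>{...}</function>` reply format that the system prompt instructs the model to emit, so the call can actually be dispatched:

```python
import json
import re


def parse_function_call(reply: str):
    """Parse a '<function=name>{...}</function>' reply into (name, args).

    Returns None if the reply does not match the function-call format.
    """
    m = re.search(r"<function=(\w+)>(\{.*\})</function>", reply, re.DOTALL)
    if m is None:
        return None
    name, raw_args = m.group(1), m.group(2)
    return name, json.loads(raw_args)


# Example reply in the format the system prompt requests:
reply = '<function=get_current_conditions>{"location": "San Francisco, CA", "unit": "Celsius"}</function>'
print(parse_function_call(reply))
```

In a full loop, the parsed result would be used to invoke the real tool and its output appended to `messages` (with the "ipython" role per the Llama 3.1 prompt format) before calling the endpoint again.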

cc @dan-homebrew @0xSage

dan-homebrew commented 1 month ago

FYI: unable to run llama3.1 on v1.0.0-151

(screenshot attached)

cortex.log cortex-cli.log