functions install get eror

LocalAI version:

latest version from today Environment, CPU architecture, OS, and Version:

uname -a
Linux office 6.5.0-21-generic #21~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb  9 13:32:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug

i cant install it To Reproduce

using this image, docker run --rm -ti --gpus all -p 8080:8080 --env-file .env -e DEBUG=true -e MODELS_PATH=/models -e THREADS=4 -v $PWD/models:/models quay.io/go-skynet/local-ai:master-cublas-cuda12-ffmpeg Expected behavior

working function container with no error Logs

 docker-compose up --build functions
Building functions
[+] Building 242.2s (10/10) FINISHED      docker:default
 => [internal] load build definition from Dockerfi  0.2s
 => => transferring dockerfile: 200B                0.1s
 => [internal] load metadata for docker.io/library  1.1s
 => [auth] library/python:pull token for registry-  0.0s
 => [internal] load .dockerignore                   0.1s
 => => transferring context: 120B                   0.1s
 => [internal] load build context                 164.9s
 => => transferring context: 3.99GB               164.5s
 => CACHED [1/4] FROM docker.io/library/python:3.1  0.1s
 => => resolve docker.io/library/python:3.10-bulls  0.1s
 => [2/4] COPY . /app                              27.4s
 => [3/4] WORKDIR /app                              0.1s
 => [4/4] RUN pip install --no-cache-dir -r requi  21.9s
 => exporting to image                             25.9s 
 => => exporting layers                            25.9s 
 => => writing image sha256:b5a0eca8b94b38b2bac60b  0.0s 
 => => naming to docker.io/library/localai_functio  0.0s 
Recreating localai_functions_1 ... done
Attaching to localai_functions_1
functions_1  | Traceback (most recent call last):
functions_1  |   File "/app/./functions-openai.py", line 76, in <module>
functions_1  |     print(run_conversation())
functions_1  |   File "/app/./functions-openai.py", line 43, in run_conversation
functions_1  |     response_message = response["choices"][0]["message"]
functions_1  | KeyError: 'choices'
localai_functions_1 exited with code 1

Additional context

gpt-3.5-turbo.yaml

name: gpt-3.5-turbo

# Model parameters
parameters:
  model: functionary-medium-v2.2.q4_0.gguf
  temperature: 0.9
  top_k: 80
  top_p: 0.9
  max_tokens: 16384
  ignore_eos: false
  n_keep: 10
  seed: 0.0
  mode: default
  step: 0
  negative_prompt: false
  typical_p: 0.9
  tfz: 
  frequency_penalty: 0 # number (minimum: 0, maximum: 2)
  repeat_penalty: 1.1 # number (minimum: 0, maximum: 2)
  mirostat_eta: 0.1 # Mirostat configuration (llama.cpp only)
  mirostat_tau: 5
  mirostat: 1 # mode=1 is for llama.cpp only.
  rope_freq_base: 1000000
  rope_freq_scale: 1
  negative_prompt_scale:

# Default context size
context_size: 16384
batch: 256
# Default number of threads
threads: 6
# Define a backend (optional). By default it will try to guess the backend the first time the model is interacted with.
backend: llama #available: llama-stable llama, stablelm, gpt2, gptj rwkv
# stopwords (if supported by the backend)
stopwords:
- "HUMAN:"
- "### Response:"
# string to trim space to
trimspace:
- string
# Strings to cut from the response
cutstrings:
- "string"

# Directory used to store additional assets
asset_dir: ""

# define chat roles
roles:
  function: 'Function Result:'
  assistant_function_call: 'Function Call:'
  assistant: '### Response:'
  system: '### System Instruction:'
  user: '### Instruction:'

# define template
template:
  # template file ".tmpl" with the prompt template to use by default on the endpoint call. Note there is no extension in the files

  instruction: functionary

function:
   disable_no_action: true
   no_action_function_name: "reply"
   no_action_description_name: "Reply to the AI assistant"
   parallel_calls: true

system_prompt:
rms_norm_eps:
# Set it to 8 for llama2 70b
ngqa: 1

## LLAMA specific options
# Enable F16 if backend supports it
f16: true

# Enable debugging
debug: true

# Enable embeddings
embeddings: true

# GPU Layers (only used when built with cublas)
gpu_layers: 8

# Enable memory lock
mmlock: false

# GPU setting to split the tensor in multiple parts and define a main GPU
# see llama.cpp for usage
tensor_split: ""
main_gpu: "0"

# Define a prompt cache path (relative to the models)
prompt_cache_path: "prompt-cache"
# Cache all the prompts
prompt_cache_all: true

# Read only
prompt_cache_ro: true

# Enable mmap
mmap: false

# Enable low vram mode (GPU only)
low_vram: true

# Set NUMA mode (CPU only)
numa: false

# Lora settings
#lora_adapter: "/path/to/lora/adapter"
#lora_base: "/path/to/lora/base"

# Disable mulmatq (CUDA)
no_mulmatq: true

# Diffusers/transformers
cuda: false

using this template functionary:

<|from|>system
<|recipient|>all
<|content|>// Supported function definitions that should be called when necessary.
namespace functions {
// Get the current weather
type get_current_weather = (_: {
// The city and state, e.g. San Francisco, CA
location: string,
}) => any;
} // namespace functions
<|from|>system
<|recipient|>all
<|content|>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary
<|from|>user
<|recipient|>all
<|content|>What is the weather for Istanbul?

the model works in chat with anytingllm but I would like to test witn auto-gpt using the functions but I get an error when installing the functions container any help wil be apreciate

my docker-compose.yaml

version: "3.9"

networks:
  private_network:
    external: true

services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
              options: 
                memory: "8G"  # Setează 6 GB pentru VRAM
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12-ffmpeg
    restart: always # should this be on-failure ?
    build:
      context: .
      dockerfile: Dockerfile
      args:
        BUILDKIT_INLINE_CACHE: 1
        # specify which cuda version your card supports: https://developer.nvidia.com/cuda-gpus
    ports:
      - 8080:8080
    mem_limit: "48G"
    env_file:
      - .env
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
      - DOCKER_BUILDKIT=1
      - DEBUG=true
      - MODELS_PATH=/models
    volumes:
      - ./models:/models:cached
      - ./images/:/tmp/generated/images/
      - ./upload:/tmp/localai/upload/
    command: ["/usr/bin/local-ai"]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://api:8080/readyz"]
      interval: 1m
      timeout: 20m
      retries: 20
    tty: true # enable colorized logs
    networks:
      - private_network

  functions:
    tty: true # enable colorized logs
    env_file:
      - .env
    build:
      context: .
      dockerfile: Dockerfile.Functions
    networks:
      - private_network
    extra_hosts:
      - "host.docker.internal:host-gateway"

  auto-gpt:
    image: significantgravitas/auto-gpt
    ports:
      - 5000:5000
    depends_on:
      #functions:
      #  condition: Service_started
      api:
        condition: service_healthy
      redis:
        condition: service_started
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://api:8080 || exit 1"]
      interval: 10s
      retries: 6
      start_period: 5m
    env_file:
      - .env
    environment:
      - 'MEMORY_BACKEND=${MEMORY_BACKEND:-redis}'
      - 'REDIS_HOST=${REDIS_HOST:-redis}'
    profiles: ["exclude-from-up"]
    volumes:
      - ./auto_gpt_workspace:/app/autogpt/auto_gpt_workspace
      - ./data:/app/data
      # allow auto-gpt to write logs to disk
      - ./logs:/app/logs
      # uncomment following lines if you want to make use of these files
      # you must have them existing in the same folder as this docker-compose.yml
      #- type: bind
      #  source: ./azure.yaml
      #  target: /app/azure.yaml
      #- type: bind
      #  source: ./ai_settings.yaml
      #  target: /app/ai_settings.yaml
    networks:
      - private_network
    extra_hosts:
      - "host.docker.internal:host-gateway"

  redis:
    image: "redis/redis-stack-server:latest"
    env_file:
      - .env
    networks:
      - private_network

mudler / LocalAI

functions install get eror #1814