1b5d / llm-api

Run any Large Language Model behind a unified API
MIT License
159 stars 25 forks

Example config file #9

Closed hussainwali74 closed 11 months ago

hussainwali74 commented 1 year ago

This repo really taught me why a running example is more important than the actual project.

Tried everything in the README but couldn't get this to work.

config.yml:

# models_dir: /models
# model_family: gptq_llama
# setup_params:
#   repo_id: repo_id
#   filename: model.safetensors
# model_params:
#   group_size: 128
#   wbits: 4
#   cuda_visible_devices: "0"
#   device: "cuda:0"
#   st_device: 0

# file: config.yaml

# models_dir: /models
# model_family: vicuna
# setup_params:
#   repo_id: TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g
#   filename: vicuna-13B-1.1-GPTQ-4bit-128g.compat.no-act-order.pt
#   convert: false
#   migrate: false
# model_params:
#   group_size: 128
#   wbits: 4
#   cuda_visible_devices: "0"
#   device: "cuda:0"
#   st_device: 0
#   ctx_size: 2000

#----------------------- alpaca
models_dir: /models
model_family: alpaca
model_name: 7b
setup_params:
  repo_id: Sosaka/Alpaca-native-4bit-ggml
  filename: ggml-alpaca-7b-q4.bin
  convert: false
  migrate: false
model_params:
  ctx_size: 2000
  seed: -1
  n_threads: 8
  n_batch: 2048
  n_parts: -1
  last_n_tokens_size: 16
#-----------------------
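As a quick way to catch config mistakes like the ones above (e.g. a `model_params` value that isn't a proper integer), a small sanity-check function can help. This is an illustrative sketch, not part of llm-api: the key names are taken from the examples in this thread, and llm-api itself may validate its config differently.

```python
# Illustrative sanity check (not part of llm-api): verify that a config
# carries the top-level keys the examples in this thread use, and that the
# model_params values are integers. Key names are assumed from the thread.
config = {
    "models_dir": "/models",
    "model_family": "alpaca",
    "model_name": "7b",
    "setup_params": {
        "repo_id": "Sosaka/Alpaca-native-4bit-ggml",
        "filename": "ggml-alpaca-7b-q4.bin",
    },
    "model_params": {
        "ctx_size": 2000,
        "seed": -1,
        "n_threads": 8,
        "n_batch": 2048,
        "n_parts": -1,
        "last_n_tokens_size": 16,
    },
}

def check_config(cfg):
    """Return a list of problems found in the config dict."""
    problems = []
    for key in ("models_dir", "model_family", "setup_params", "model_params"):
        if key not in cfg:
            problems.append(f"missing top-level key: {key}")
    for key, value in cfg.get("model_params", {}).items():
        if not isinstance(value, int):
            problems.append(f"model_params.{key} should be an integer, got {value!r}")
    return problems

print(check_config(config))  # → []
```

Running `check_config` on an empty dict would report all four missing top-level keys, which makes it easy to spot an indentation mistake that silently nests keys in the wrong place.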

# models_dir: /models     # dir inside the container
# model_family: alpaca
# model_name: 7b
# setup_params:
#   key: value
# model_params:
#   key: value

# models_dir: /models     # dir inside the container
# model_family: alpaca
# model_name: 7b
# setup_params:
#   repo_id: user/repo_id
#   filename: ggml-model-q4_0.bin
#   convert: false
#   migrate: false
# model_params:
#   ctx_size: 2000
#   seed: -1
#   n_threads: 8
#   n_batch: 2048
#   n_parts: -1
#   last_n_tokens_size: 16

Models directory: (screenshot)

1b5d commented 1 year ago

Sorry this took so long, I just saw it. Can you remove `convert: false` and `migrate: false` and try again? Those options are only supported for the ggml format using llama.cpp.

1b5d commented 11 months ago

Closing this, as a new example config file and the README have been updated.