c0sogi / llama-api

An OpenAI-like LLaMA inference API
MIT License
111 stars 9 forks source link

Is there a way to use this on google Colab and have the url be public? #11

Open ashercn97 opened 1 year ago

ashercn97 commented 1 year ago

I would love to use this in google Colab but I would need the url to be public, is there a way to do that with this?

c0sogi commented 1 year ago
""" ------------------------------------------------------------------------------------- '''
!!!  첫 사용 시 반드시 상단 메뉴 [런타임] -> [런타임 유형 변경] -> [하드웨어 가속기]를 [GPU] 및 [T4]로 설정
 *   이후 [런타임] -> [모두 실행] 클릭. 서버가 켜질 때까지 보통 5분, 느리면 10분 가량 소요.
 *   일정 시간 사용할 경우 구글 캡차가 수시로 뜨며 코랩을 중단시키려 하므로, 모니터 한 쪽에 창 켜둬야 함
 *   English: [Runtime] -> [Change runtime type] -> Select [T4] -> [Runtime] -> [Run all]
''' ------------------------------------------------------------------------------------- """

USE_GOOGLE_DRIVE = False
PORT = 8000

model_definitions = {
    "ggml": {
        "type": "llama.cpp",
        "model_path": "TheBloke/MythoMax-L2-Kimiko-v2-13B-GGUF",
        "max_total_tokens": 4096,
    },
    "gptq": {
        "type": "exllama",
        "model_path": "TheBloke/MythoMax-L2-Kimiko-v2-13B-GPTQ",
        "max_total_tokens": 4096
    },
}
openai_replacements = {"gpt-3.5-turbo": "ggml", "gpt-4": "gptq"}
import json
import os
os.environ["MODEL_DEFINITIONS"] = json.dumps(model_definitions)
os.environ["OPENAI_REPLACEMENTS"] = json.dumps(openai_replacements)

# ==================================================

if USE_GOOGLE_DRIVE:
  from google.colab import drive
  drive.mount("/content/drive/")
else:
  !mkdir -p /content/drive/MyDrive/
%cd /content/drive/MyDrive/

!git clone --quiet https://github.com/c0sogi/llama-api llama-api
%cd llama-api
!python -m main --port {PORT} --tunnel --install-pkgs --skip-tf-install --skip-torch-install

I think this works and you can see a public link when all jobs finish.