Open mcchung52 opened 4 months ago
I am running gpt-pilot with Llama-3-70B-Instruct.Q5_K_M. It does not seem to suffer from malformed JSON issues, although it has other problems.
I changed the source to use a read timeout of 3000 instead of 300, and changed the code to use the Llama tokenizer instead of OpenAI's tiktoken. The Llama tokenizer uses tiktoken internally, so they are pretty close.
Thanks for sharing that. I guess I'd need a memory upgrade then; I'm on 32 GB. Will 64 GB do? I already changed the API timeout to 30 min because I still got an "api timeout" error with 10 min when connecting to LM Studio (LLM on CPU). Curious how you changed to the Llama tokenizer. Do you mind sharing the changes? Thanks
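For context on the timeout change: `requests` accepts a `(connect, read)` timeout tuple, which is what constants like `API_CONNECT_TIMEOUT`/`API_READ_TIMEOUT` feed into, so only the read timeout has to grow for slow CPU inference. A minimal sketch with illustrative values (not the project's defaults):

```python
# Sketch: raise only the read timeout for slow local inference.
# The constant names mirror gpt-pilot's const.llm; the values here
# are illustrative.
import requests

API_CONNECT_TIMEOUT = 10    # fail fast if LM Studio isn't reachable
API_READ_TIMEOUT = 1800     # allow 30 min between response bytes on CPU


def post_with_timeouts(url, payload):
    # requests interprets a tuple as (connect timeout, read timeout)
    return requests.post(
        url,
        json=payload,
        stream=True,
        timeout=(API_CONNECT_TIMEOUT, API_READ_TIMEOUT),
    )
```

This keeps connection failures quick to detect while tolerating very slow token generation.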
```python
import re
import requests
import os
import sys
import time
import json
import tiktoken
from prompt_toolkit.styles import Style
from jsonschema import validate, ValidationError
from utils.style import color_red, color_yellow
from typing import List
from const.llm import MAX_GPT_MODEL_TOKENS, API_CONNECT_TIMEOUT, API_READ_TIMEOUT
API_READ_TIMEOUT = 3000
from const.messages import AFFIRMATIVE_ANSWERS
from logger.logger import logger, logging
from helpers.exceptions import TokenLimitError, ApiKeyNotDefinedError, ApiError
from utils.utils import fix_json, get_prompt
from utils.function_calling import add_function_calls_to_request, FunctionCallSet, FunctionType
from utils.questionary import styled_text
from .telemetry import telemetry

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B", revision="refs/pr/6")
```
Those are just the top few lines of the file; it's llm_connection.py in the utils folder.
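Once that tokenizer object is loaded, the tiktoken-based token counting can be routed through it instead. This is only a sketch; `count_tokens` is a hypothetical helper, not gpt-pilot's actual function name:

```python
# Sketch: count tokens with whatever tokenizer is configured.
# `tokenizer` can be tiktoken's encoder or a Hugging Face AutoTokenizer;
# both expose an encode(text) -> list-of-ids method.
def count_tokens(tokenizer, messages):
    """Sum token counts over a list of chat messages ({"content": ...})."""
    return sum(len(tokenizer.encode(msg["content"])) for msg in messages)
```

With the Llama tokenizer loaded as above, a helper like this would replace the tiktoken call wherever gpt-pilot checks against `MAX_GPT_MODEL_TOKENS`.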
Yikes.. getting an error
Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B/resolve/main/config.json.
Access to model meta-llama/Meta-Llama-3-8B is restricted. You must be authenticated to access it.
Probably. It is free though.
Are you storing your token somewhere? How is it pulling the model?
Not sure what you mean by storing token. If you navigate to Hugging Face and try to access Llama models it will ask you to go through a quick process. You can pretty much invent the info. At some point in time, I think, I also set something up with ssh keys and HF account.
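Concretely, Hugging Face gates the Llama repos behind a license click-through; after that you authenticate with an access token, either once via `huggingface-cli login` or through an environment variable. A sketch, assuming the token is stored in `HF_TOKEN`:

```python
import os


def resolve_hf_token():
    # Assumption: the token lives in an env var. If this returns None,
    # transformers falls back to credentials cached by `huggingface-cli login`.
    return os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN")


# Pass it when loading the gated tokenizer, e.g.:
# tokenizer = AutoTokenizer.from_pretrained(
#     "meta-llama/Meta-Llama-3-70B", token=resolve_hf_token())
```

That should clear the "Cannot access gated repo" error once the license request on the model page has been approved.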
Version
Command-line (Python) version
Suggestion
Using LM Studio, the local LLM gets stuck in the task generation phase due to an invalid JSON response. After spending a lot of time generating (since the task list is long), the output produced is mostly correct except for a couple of syntax errors.
If you can also point me to right places, I can try to create a PR. Thanks in advance for your help~
Background: I'm using a local LLM and NOT OpenAI due to cost and privacy issues, so I need to make this work for a local LLM, which I guess introduces problems that don't happen with OpenAI. Also, generation is slow, but the MVG (minimum viable goal) is to see whether this works, no matter how long it takes.
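On the invalid-JSON problem: local models often wrap the JSON in a markdown code fence or add chatter around it, so a pre-parse cleanup step recovers many responses. A minimal sketch (gpt-pilot's own `fix_json` helper, visible in the imports quoted earlier, goes further than this):

```python
import json
import re


def extract_json(text):
    """Best-effort JSON extraction from a model reply.

    Strips a surrounding markdown code fence if present, then parses.
    Returns None when the payload still isn't valid JSON.
    """
    text = text.strip()
    # Match a fenced block like: three backticks, optional "json", body, three backticks
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

A retry loop that re-prompts the model when this returns None would be one way to get past the stuck task-generation phase.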