guidance-ai / guidance

A guidance language for controlling large language models.

codellama support? #328

Closed johndpope closed 7 months ago

johndpope commented 11 months ago

Is this possible, or does the model have to be retrained? Or something else?

smellslikeml commented 11 months ago

It does! We've tested it on a custom fine-tuned llama2-7b model (remyxai/ffmperative-7b, hosted on Hugging Face). Both LLaMA 1 and LLaMA 2 use the same HF code:

    guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")
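For reference, a minimal sketch of driving that wrapper with the Handlebars-style API guidance used at the time (pre-0.1); the prompt text here is only illustrative:

    import guidance

    # plain (non-chat) LLaMA wrapper, as above
    guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

    # {{gen}} runs the model and stores the completion under 'answer'
    program = guidance("Q: {{question}}\nA: {{gen 'answer' max_tokens=64}}")

    out = program(question="Trim the first 10 seconds off input.mp4")
    print(out["answer"])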

johndpope commented 11 months ago

https://github.com/QuangBK/localLLM_guidance

@QuangBK - can you help get this working with llama2?

UPDATE: this fork by @fullstackwebdev is much better, though it needs llama2 updates: https://github.com/fullstackwebdev/localLLM_guidance

[Screenshot: checkpoint run]

[Screenshot: agent drop-down]

UPDATE

I attempted to use @danikhan632's fork, but no dice.


        else:
            # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
            # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
            # Because LLaMA already has role start and end, we don't need role_start=role_start, role_end=role_end
            # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
            guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")

N.B. this didn't work.


    if model_string == "TheBloke_Llama-2-13B-chat-GGML":
        MODEL_PATH = '/media/2TB/text-generation-webui/models/TheBloke_Llama-2-13B-chat-GGML/llama-2-13b-chat.ggmlv3.q5_K_S.bin'
        CHECKPOINT_PATH = None

Attempting now to use the solution above. @smellslikeml, are there any video-workflow guidance programs or canned prompts you crafted that would make sense to share?


        else:
            # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
            # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
            # Because LLaMA already has role start and end, we don't need role_start=role_start, role_end=role_end
            # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
            # guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")
            guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

UPDATE 3

Using this:
        guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

I'm getting this error:
        raise NotImplementedError("In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!")
    NotImplementedError: In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!

    Error in program:  In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!
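The error means the program template uses chat role tags ({{#system}}...{{/system}}, {{#user}}...{{/user}}), which the plain LLaMA class doesn't implement; only chat-specific subclasses (e.g. guidance.llms.transformers.Vicuna) do. One workaround under the 0.0.x API, sketched below, is to write the roles directly into a plain prompt so the base class suffices:

    import guidance

    guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

    # no {{#system}}/{{#user}} tags, so the non-chat LLaMA class works;
    # the role labels are just literal text in the prompt
    program = guidance("You are a helpful assistant.\nUser: {{query}}\nAssistant: {{gen 'reply' max_tokens=128}}")

    print(program(query="Concatenate a.mp4 and b.mp4")["reply"])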

pip list - https://gist.github.com/johndpope/2bc86b8b976a81e47f655267c4daf537
danikhan632 commented 11 months ago

Let me try updating; unsure if the repo is still active.

VarunGumma commented 9 months ago

Any update on using Llama-2 chat models with guidance?

danikhan632 commented 9 months ago

Expect something, maybe Friday.

johndpope commented 9 months ago

It looks like @iiis-ai has a working example with LLaMA 1, and maybe with llama2?

https://github.com/iiis-ai/cumulative-reasoning/blob/6c8632577699a8b3f8eee88671ee83c677fa4aea/AutoTNLI/autotnli-direct.py#L22

    guidance.llm = guidance.llms.transformers.LLaMA(args.model, device_map="auto", token_healing=True, torch_dtype=torch.bfloat16)

https://github.com/yifanzhang-pro/cumulative-reasoning-anonymous/blob/07bcc6b21aedbee7c82f44b52aa3c0fc123e4d03/AutoTNLI/autotnli-cr.py#L27
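A self-contained version of that initialization (the model id stands in for args.model and is only illustrative):

    import torch
    import guidance

    # token_healing repairs tokenization at the prompt/generation boundary;
    # bfloat16 halves memory use relative to fp32
    guidance.llm = guidance.llms.transformers.LLaMA(
        "meta-llama/Llama-2-7b-hf",
        device_map="auto",
        token_healing=True,
        torch_dtype=torch.bfloat16,
    )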

marcotcr commented 7 months ago

Llama 2 works fine in the new release, both with HF transformers and with llama.cpp. Please check it out.
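For anyone landing here later, a minimal sketch against the post-0.1 API that this comment refers to (the model id and GGUF path are illustrative):

    from guidance import models, gen

    # HF transformers backend
    lm = models.Transformers("meta-llama/Llama-2-7b-hf")

    # or a local llama.cpp model:
    # lm = models.LlamaCpp("./llama-2-13b-chat.Q5_K_S.gguf", n_gpu_layers=-1)

    # the new API appends text and generations to an immutable model object
    lm = lm + "Question: Does guidance support Llama 2?\nAnswer: " + gen("answer", max_tokens=32)
    print(lm["answer"])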