It does! We've tested on a custom fine-tuned llama2-7b model (remyxai/ffmperative-7b, hosted on Hugging Face). Both models (LLaMA 1 & 2) use the same HF code:
```python
guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")
```
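For anyone landing here, a minimal end-to-end sketch of that setup (assuming the old guidance 0.0.x API, where `guidance.llms.transformers.LLaMA` exists, and enough GPU memory for the 7B weights; the task string is just illustrative):

```python
import guidance

# Load the fine-tuned llama2-7b checkpoint through guidance's LLaMA wrapper;
# device_map="auto" lets accelerate place the weights across available devices.
guidance.llm = guidance.llms.transformers.LLaMA(
    "remyxai/ffmperative-7b", device_map="auto"
)

# A plain (non-chat) guidance program; no role tags, so no chat subclass is needed.
program = guidance("""Task: {{task}}
Answer: {{gen 'answer' max_tokens=64}}""")

out = program(task="Trim the first 10 seconds of input.mp4")
print(out["answer"])
```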
https://github.com/QuangBK/localLLM_guidance
@QuangBK - can you help get this working with llama2?
UPDATE - this fork by @fullstackwebdev is much better, though it needs llama2 updates (checkpoint paths). https://github.com/fullstackwebdev/localLLM_guidance
Agent drop-down.
UPDATE 2
I attempted to use @danikhan632's fork - but no dice.
```python
else:
    # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
    # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
    # Because LLaMA already has role start and end, we don't need role_start=role_start, role_end=role_end
    # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
    guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")
```
N.B. this didn't work.
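One quick sanity check when the TGWUI route fails is to hit the text-generation-webui API directly, independent of guidance (a sketch assuming the legacy TGWUI API, i.e. the server started with `--api`; the route and payload shape changed in later versions):

```python
import requests

# Legacy text-generation-webui completion endpoint on port 5000.
resp = requests.post(
    "http://127.0.0.1:5000/api/v1/generate",
    json={"prompt": "The capital of France is", "max_new_tokens": 8},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```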
```python
if model_string == "TheBloke_Llama-2-13B-chat-GGML":
    MODEL_PATH = '/media/2TB/text-generation-webui/models/TheBloke_Llama-2-13B-chat-GGML/llama-2-13b-chat.ggmlv3.q5_K_S.bin'
    CHECKPOINT_PATH = None
```
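To rule out a bad checkpoint, that GGML file can also be loaded directly with llama-cpp-python (a sketch; GGML support was dropped when llama.cpp moved to GGUF, so this assumes an older llama-cpp-python, roughly <= 0.1.78):

```python
from llama_cpp import Llama

MODEL_PATH = '/media/2TB/text-generation-webui/models/TheBloke_Llama-2-13B-chat-GGML/llama-2-13b-chat.ggmlv3.q5_K_S.bin'

# Load the ggmlv3 checkpoint and run a one-line completion to verify it works.
llm = Llama(model_path=MODEL_PATH, n_ctx=2048)
out = llm("The capital of France is", max_tokens=8)
print(out["choices"][0]["text"])
```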
Attempting now to use the solution above. @smellslikeml - are there any video workflow guides / canned prompts you crafted that would make sense to share?
```python
else:
    # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
    # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
    # Because LLaMA already has role start and end, we don't need role_start=role_start, role_end=role_end
    # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
    # guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")
    guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")
```
UPDATE 3
Using this:
```python
guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")
```
I'm getting this error:
```
raise NotImplementedError("In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!")
NotImplementedError: In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!
Error in program: In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!
```
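The error is guidance refusing to run `{{#user}}`/`{{#assistant}}` role tags against the base Transformers wrapper. A workaround sketch (not an official API: it assumes the 0.0.x pattern where chat subclasses such as `Vicuna` override static `role_start`/`role_end` methods, and it hard-codes the llama-2 chat template, so treat it as a starting point):

```python
import guidance
from guidance.llms.transformers import LLaMA

class LLaMA2Chat(LLaMA):
    """Hypothetical chat subclass mapping guidance role tags onto the
    llama-2 [INST] ... [/INST] template."""
    llm_name: str = "llama2-chat"

    @staticmethod
    def role_start(role):
        return {"system": "<<SYS>>\n", "user": "[INST] ", "assistant": " "}.get(role, "")

    @staticmethod
    def role_end(role):
        return {"system": "\n<</SYS>>\n\n", "user": " [/INST]", "assistant": " </s>"}.get(role, "")

guidance.llm = LLaMA2Chat("remyxai/ffmperative-7b", device_map="auto")
```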
`pip list` output: https://gist.github.com/johndpope/2bc86b8b976a81e47f655267c4daf537
Let me try updating; unsure if the repo is still active.
Any update on using LLaMA 2 chat models with guidance?
Expect something, maybe Friday.
It looks like @iiis-ai has a working example with llama 1 - maybe it works with llama2 too?
```python
guidance.llm = guidance.llms.transformers.LLaMA(args.model, device_map="auto", token_healing=True, torch_dtype=torch.bfloat16)
```
https://github.com/yifanzhang-pro/cumulative-reasoning-anonymous/blob/07bcc6b21aedbee7c82f44b52aa3c0fc123e4d03/AutoTNLI/autotnli-cr.py#L27
LLaMA 2 works fine in the new release, both with HF transformers and with llama.cpp. Please check it out.
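For reference, the new-release (guidance 0.1+) API looks quite different; a minimal sketch of both paths, with the model names/paths as placeholders (the llama.cpp path now expects a GGUF file, not GGML):

```python
from guidance import models, gen

# Path A: HF transformers backend.
lm_hf = models.Transformers("meta-llama/Llama-2-7b-hf", device_map="auto")

# Path B: llama.cpp backend (requires llama-cpp-python and a GGUF checkpoint).
lm_cpp = models.LlamaCpp("/path/to/llama-2-13b-chat.Q5_K_S.gguf", n_gpu_layers=-1)

# The 0.1+ API composes programs with `+` instead of template strings.
out = lm_cpp + "The capital of France is " + gen("answer", max_tokens=5)
print(out["answer"])
```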
Is this possible, or do we have to redo the training?