Closed — rwood-97 closed this 7 months ago
thanks @rwood-97! I updated a few of the other libraries and I think this largely works now. There were a few issues, like setting up the tokenizers for each model, and also your comment about using legacy code from the llama_index library, which I've fixed.
as discussed on Slack, we'll port over that code so we can customise it for the different models we might use, and just in case llama-index drops it in the future. It would be good to have an easy way for us to pass in prompt templates and different ways to construct prompts. I think that's actually why we weren't successful with the Huggingface models previously: we just didn't configure them properly (I think).
can you have a last check over please, and then we can merge?
```
OSError: You are trying to access a gated repo.
Make sure to request access at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`.
```
Just logging this error in case we need to address it.
Update: I needed to request access; it's now in review.
Okay all good :)
ah yeah, this is a good spot @rwood-97. I added a line in there to load the tokenizer for Llama2, which needs access to be requested. I think we would need to add a line to load an environment variable for a `HUGGINGFACE_TOKEN`. We would need that for the Huggingface model as well, if the user ever wanted to use a gated model that requires some sort of access first, like Llama or Gemma. I'll make a quick issue.
Updating to new llama-index
Fixes #162
Notes:
- `messages_to_prompt` and `completion_to_prompt` import. Potentially we should update, but I can't find the `completion_to_prompt` in the new llama-index. Maybe here or here is relevant.
- `Settings` in `setup_settings`, but I don't really know if I need to do this.
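If the old imports really are gone, one option is to define our own prompt-construction functions and pass them to the LLM. A rough sketch for the Llama 2 chat format (the template and the simplified message handling below are assumptions, not code from this PR):

```python
# Sketch of custom prompt-construction callables. llama-index's
# HuggingFaceLLM accepts `messages_to_prompt` and `completion_to_prompt`
# arguments; the [INST]/<<SYS>> tags below are the standard Llama 2 chat
# template.

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def completion_to_prompt(completion: str, system_prompt: str = "") -> str:
    # Wrap a plain completion request in the Llama 2 instruction tags.
    system = f"{B_SYS}{system_prompt}{E_SYS}" if system_prompt else ""
    return f"{B_INST} {system}{completion.strip()} {E_INST}"


def messages_to_prompt(messages: list[dict]) -> str:
    # Very simplified: fold a list of {"role", "content"} messages into a
    # single instruction block. A full implementation would alternate
    # user/assistant turns.
    system = ""
    parts = []
    for msg in messages:
        if msg["role"] == "system":
            system = msg["content"]
        else:
            parts.append(msg["content"])
    return completion_to_prompt("\n".join(parts), system_prompt=system)


# These could then be passed when constructing the LLM, e.g.:
# llm = HuggingFaceLLM(..., messages_to_prompt=messages_to_prompt,
#                      completion_to_prompt=completion_to_prompt)
```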