iwinterknight opened this issue 1 year ago
Load the tokenizer from the base model (llama), not from your checkpoint.
> Load the tokenizer from the base model (llama), not from your checkpoint.

So I load the tokenizer from the base model and `base_model` from the checkpoint?
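In code, the suggested split looks roughly like this (a minimal sketch assuming the standard transformers/peft APIs; the model id and checkpoint path are placeholders):

```python
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = "decapoda-research/llama-7b-hf"  # placeholder: the base model you fine-tuned
checkpoint = "lora-alpaca/checkpoint-800"     # placeholder: your LoRA checkpoint dir

tokenizer = LlamaTokenizer.from_pretrained(base_model)  # tokenizer: from the base model
model = LlamaForCausalLM.from_pretrained(base_model)    # base weights: from the base model
model = PeftModel.from_pretrained(model, checkpoint)    # LoRA adapters: from the checkpoint
```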
I noticed the output folder does not contain the tokenizer files; it only holds the checkpoint, so I had to load the pre-trained model again just to get the tokenizer. My solution is to save the tokenizer after training with `tokenizer.save_pretrained("MY_OUTPUT_DIRECTORY")`.
I hope this helps.
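Building on that, a minimal sketch of the save step (assuming `model` and `tokenizer` are still in scope after training; `OUTPUT_DIR` is a placeholder):

```python
OUTPUT_DIR = "MY_OUTPUT_DIRECTORY"  # placeholder output directory

# Save both pieces so a later from_pretrained(OUTPUT_DIR)
# finds the weights and the tokenizer in one place.
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
```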
I want to export the trained model as a PyTorch checkpoint. Do I need to load the model before running export_state_dict_checkpoint.py?
When I try to load the model like this:
```python
checkpoint = '/content/gdrive/MyDrive/Projects/alpaca_lora/alpaca-lora/lora-alpaca/checkpoint-800'
tokenizer = LlamaTokenizer.from_pretrained(checkpoint)
model, tokenizer, prompter = load(lora_weights=checkpoint, tokenizer=tokenizer)
```
where `load()` is:
I get an error:

```
OSError: Can't load tokenizer for '/content/gdrive/MyDrive/Projects/alpaca_lora/alpaca-lora/lora-alpaca/checkpoint-800'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/content/gdrive/MyDrive/Projects/alpaca_lora/alpaca-lora/lora-alpaca/checkpoint-800' is the correct path to a directory containing all relevant files for a LlamaTokenizer tokenizer.
```
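For reference, the fix that follows from the answers above is to point the tokenizer at the base model and keep the checkpoint for the LoRA weights only (a sketch reusing the poster's `load()` helper; the base-model id is a placeholder):

```python
base_model = "decapoda-research/llama-7b-hf"  # placeholder: whichever base model was fine-tuned
checkpoint = '/content/gdrive/MyDrive/Projects/alpaca_lora/alpaca-lora/lora-alpaca/checkpoint-800'

tokenizer = LlamaTokenizer.from_pretrained(base_model)  # base model, not the checkpoint
model, tokenizer, prompter = load(lora_weights=checkpoint, tokenizer=tokenizer)
```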