yxuansu / OpenAlpaca

OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA

potential bugs in the program on the model-card page. #7

Zhaoyi-Li21 commented 1 year ago

Hi, first of all, thanks a lot for your work training OpenLLaMA and releasing OpenAlpaca, which definitely brings much convenience to the community. I found some bugs at https://huggingface.co/openllmplayground/openalpaca_7b_700bt_preview, in the example code reproduced below:

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
model_path = r'openllmplayground/openalpaca_7b_700bt_preview'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path).cuda()
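# point 1 below: the next line raises an AttributeError and appears unnecessary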
tokenizer.bos_token_id, tokenizer.eos_token_id = 1,2 
instruction = r'What is an alpaca? How is it different from a llama?'
'''
instruction = r'Write an e-mail to congratulate new Stanford admits and mention that you are excited about meeting all of them in person.'
instruction = r'What is the capital of Tanzania?'
instruction = r'Write a well-thought out abstract for a machine learning paper that proves that 42 is the optimal seed for training neural networks.'
'''

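# Alpaca-style prompt template for instructions without an additional input field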
prompt_no_input = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:'
tokens = tokenizer.encode(prompt_no_input)

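# point 2 below: the tensor built here stays on the CPU; it needs .cuda() before model.generate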
tokens = torch.LongTensor(tokens).unsqueeze(0)
instance = {'input_ids': tokens,
            'top_k': 50,
            'top_p': 0.9,
            'generate_len': 128}

length = len(tokens[0])
with torch.no_grad():
    rest = model.generate(
            input_ids=tokens, 
            max_length=length+instance['generate_len'], 
            use_cache=True, 
            do_sample=True, 
            top_p=instance['top_p'], 
            top_k=instance['top_k']
        )

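# strip the prompt tokens and decode only the newly generated text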
output = rest[0][length:]
string = tokenizer.decode(output, skip_special_tokens=True)
print(f'[!] Generation results: {string}')

Point 1: the assignment "tokenizer.bos_token_id, tokenizer.eos_token_id = 1, 2" cannot set attributes on the tokenizer, so this line raises an AttributeError. Besides, I do not see the purpose of this line, since the tokenizer's bos_token_id and eos_token_id are already 1 and 2, respectively.

Point 2: the input tensor has to be moved to the GPU with .cuda() before it is passed into the model (which was loaded with .cuda()); otherwise a runtime error occurs.

Thanks for reading this.
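For reference, here is a minimal corrected version of the snippet with both fixes applied (a sketch; the model path, prompt template, and sampling parameters are unchanged from the model card):

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = r'openllmplayground/openalpaca_7b_700bt_preview'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path).cuda()
# fix 1: the bos/eos reassignment is dropped; the tokenizer already uses 1 and 2

instruction = r'What is an alpaca? How is it different from a llama?'
prompt_no_input = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:'

tokens = tokenizer.encode(prompt_no_input)
# fix 2: move the input ids to the GPU, matching the model loaded with .cuda()
tokens = torch.LongTensor(tokens).unsqueeze(0).cuda()

length = len(tokens[0])
with torch.no_grad():
    rest = model.generate(
        input_ids=tokens,
        max_length=length + 128,  # generate up to 128 new tokens
        use_cache=True,
        do_sample=True,
        top_p=0.9,
        top_k=50,
    )

# decode only the newly generated tokens, skipping the prompt
output = rest[0][length:]
string = tokenizer.decode(output, skip_special_tokens=True)
print(f'[!] Generation results: {string}')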