GanjinZero / RRHF

[NIPS2023] RRHF & Wombat

single_sentence_inference output is empty #14

Closed. better629 closed this issue 1 year ago.

better629 commented 1 year ago

transformers 4.28.0.dev0

wombat-7b was converted from decapoda-research/llama-7b-hf with your apply_delta.py code.

When using single_sentence_inference.py with path="path/to/wombat-7b" and decoding with

for seq in generation_output.sequences:
    output = tokenizer.decode(seq)
    print("output ", output)

the output is

output   Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
who are you

### Response:

for the query "who are you", i.e., the model returns an empty answer.

Am I doing something wrong?
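
(For context, the code around that decode call presumably looks like the minimal sketch below; the generation arguments are illustrative assumptions, not the verbatim script.)

from transformers import LlamaForCausalLM, LlamaTokenizer

path = "path/to/wombat-7b"
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path)

# Full prompt string as quoted later in this thread.
query = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nwho are you\n\n### Response:"
)
inputs = tokenizer(query, return_tensors="pt")
generation_output = model.generate(
    **inputs, max_new_tokens=256, return_dict_in_generate=True
)
for seq in generation_output.sequences:
    output = tokenizer.decode(seq)
    print("output ", output)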

GanjinZero commented 1 year ago

Please use the function generate_with_prompt(inp, use_prompt=True), which adds a prompt template to your input. Wombat is fine-tuned with this template.

GanjinZero commented 1 year ago

query = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nwho are you\n\n### Response:"

better629 commented 1 year ago

Yes, I use the default code in single_sentence_inference.py with generate_with_prompt(inp, use_prompt=True). The output above is what I get inside for seq in generation_output.sequences:. It seems the response is still not right.

GanjinZero commented 1 year ago

I ran single_sentence_inference.py and obtained: "As an AI language model, I do not have a physical form or identity. However, I can assist you with various tasks and provide information based on the input given to me." Can you try another query? Did you modify any code?

better629 commented 1 year ago

The steps:

1. The wombat-7b model was converted from decapoda-research/llama-7b-hf with your apply_delta.py code.
2. I use LlamaTokenizer instead of AutoTokenizer (because of transformers 4.28.0.dev0; see the sketch after this comment). The same swap works correctly in vicuna and other codebases. No other code was changed.
3. I have tried several different questions; they all return the same empty response.

Are there any problems with steps 1 and 2?
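
(A minimal sketch of the step-2 swap; the path is a placeholder, and the comment states the commonly reported reason for the swap, which is an inference on my part rather than something stated in this thread.)

from transformers import LlamaTokenizer  # instead of AutoTokenizer

# decapoda-research checkpoints declare the tokenizer class as "LLaMATokenizer"
# in tokenizer_config.json, a casing that AutoTokenizer in transformers
# 4.28.0.dev0 cannot resolve, hence the manual swap.
tokenizer = LlamaTokenizer.from_pretrained("path/to/wombat-7b")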

GanjinZero commented 1 year ago

For 1., please check whether you get

print(model.model.layers[20].self_attn.v_proj.weight[10,20:30])
tensor([ 0.0270, -0.0142, -0.0256, -0.0400, -0.0592,  0.0215,  0.0117,  0.0284,
         0.0238, -0.0033])

For 2., my transformers environment is also 4.28.0.dev0; I have tried changing AutoTokenizer to LlamaTokenizer, and both work well for me.
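
(The check for 1. can be run with a minimal script like this; the model path is a placeholder.)

import torch
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("path/to/wombat-7b")
with torch.no_grad():
    # Expected slice for a correctly converted wombat-7b, per the values above.
    print(model.model.layers[20].self_attn.v_proj.weight[10, 20:30])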

better629 commented 1 year ago

For step 1, the values are the same:

Loading checkpoint shards: 100%|██████████| 2/2 [00:43<00:00, 21.62s/it]
tensor([ 0.0270, -0.0142, -0.0256, -0.0400, -0.0592,  0.0215,  0.0117,  0.0284,
         0.0238, -0.0033], grad_fn=<SliceBackward0>)

It's really confusing.

GanjinZero commented 1 year ago

Please copy the full script you use; let me check whether I can figure something out.

better629 commented 1 year ago

I have sent it to your 126 email address listed on GitHub.

better629 commented 1 year ago

If you use decapoda-research/llama-7b-hf instead of your own converted model, please update wombat-7b/config.json with

  "bos_token_id": 1,
  "eos_token_id": 2,

Then you will get the right answer.
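
(A sketch of that fix applied programmatically; the path is a placeholder. The likely mechanism, as an inference: the decapoda-research config is known to ship bos_token_id=0 and eos_token_id=1, which do not match LLaMA's actual special tokens (<s>=1, </s>=2), so generation stops at the wrong token and the text after "### Response:" comes out empty.)

import json

cfg_path = "wombat-7b/config.json"  # adjust to your local path
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["bos_token_id"] = 1  # <s>
cfg["eos_token_id"] = 2  # </s>
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)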