run generate.py always generate same results

LiuPearl1 commented 1 year ago

I meet a problem. After finetuning the 7B model, I want to test the finetuned model.

In reference, I run: CUDA_VISIBLE_DEVICES=0 python generate.py \ --load_8bit \ --base_model './llama-7B-hf' \ --lora_weights './lora-alpaca-multigpu'

base_model is the original llama-7B model location, lora_weights is new generated model location:

I find that whether I add lora_weights or not, the inference results is same. I have changed temperature, it doesn't work. What's wrong with my inference?

With loading lora_weights, the code like this:

Without loading lora_weights, the code like this:

1500256797 commented 1 year ago

i meet same issues.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA exception! Error code: no CUDA-capable device is detected
CUDA exception! Error code: initialization error
CUDA SETUP: CUDA runtime path found: /home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/libcudart.so
/home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
  warn(msg)
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.

AngainorDev commented 1 year ago

I find that whether I add lora_weights or not, the inference results is same

With an instruction like "Tell me about Alpacas" ?

Are you sure the base model you use is the raw one, not an already merged alpaca?

LiuPearl1 commented 1 year ago

@AngainorDev I have asked four instruction, like “what color is the sky”， “Give me a python code which can sum values in a list”，“Do your know Chatgpt”，“List 5 suggestions on how to make yourself energetic”. And with/without lora_weights, the results are same.

When I trained the model, I used the command: OMP_NUM_THREADS=4 WORLD_SIZE=4 CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 \ --master_port=1234 finetune.py \ --base_model ./llama-7B-hf \ --data_path ./alpaca_data_cleaned.json \ --output_dir ./lora-alpaca-multigpu

LiuPearl1 commented 1 year ago

i meet same issues.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA exception! Error code: no CUDA-capable device is detected
CUDA exception! Error code: initialization error
CUDA SETUP: CUDA runtime path found: /home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/libcudart.so
/home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
  warn(msg)
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/jianjianjianjian/miniconda3/envs/alpaca_dev/lib/python3.10/site-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.

Your problems not same with me. You don't have GPU.

AngainorDev commented 1 year ago

Check with the "test" instruction "Tell me about alpacas".

Default (non fine tuned) llama will quickly spit out repeating stuff like

Alpacas are a type of camelid. They are native to South America. They are a domesticated animal. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool. They are raised in herds. They are raised for their wool.

While an Alpaca fine tune will be more consistent like

Alpacas are small, cute, and fluffy animals native to South America. They are related to camels and llamas, and are known for their soft, luxurious fleece. Alpacas are herd animals and live in family groups. They are very social and can be trained to lead people on walks. They are also very gentle and docile, making them popular as pets.

(just tested right now, commenting out as you did)

This will allows you to id the error. If you get the consistent output without the LoRA active, then your local ./llama-7B-hf is not the bare llama model, but already an alpaca model. If you get nonsense with lora, then your lora is wrong.

You can also test by replacing any of the model and/or lora by an online HF repo instead of your local directory, so you can make sure which one is wrong. lora "tloen/alpaca-lora-7b" base_model "decapoda-research/llama-7b-hf"

LiuPearl1 commented 1 year ago

just tested right now, commenting out as you did

@AngainorDev Hi, I just test the "Tell me about alpacas" instruction.

Without using lora_weights, the resultes is below:

The results is too breif, I doubt whether is the Max tokens reason, so I turn up the value, the results is same.

I doubt the original weights is not correct, but original weights was downloaded on this website:https://huggingface.co/decapoda-research/llama-7b-hf/tree/main

jiejiejj commented 1 year ago

I meet a problem. After finetuning the 7B model, I want to test the finetuned model.

In reference, I run: CUDA_VISIBLE_DEVICES=0 python generate.py \ --load_8bit \ --base_model './llama-7B-hf' \ --lora_weights './lora-alpaca-multigpu'

base_model is the original llama-7B model location, lora_weights is new generated model location:

I find that whether I add lora_weights or not, the inference results is same. I have changed temperature, it doesn't work. What's wrong with my inference?

With loading lora_weights, the code like this:

Without loading lora_weights, the code like this:

Hi! May I ask did you figure out a solution? I got exactly the same issue, the loaded lora weights have no effect on the results.

Casi11as commented 1 year ago

Same question.I found that after training, in q_proj and v_proj, All weights of lora_B are 0. So that the weights of the model before and after training are exactly the same, os the output is also the same. But I don't know why this is happening.

temp

msamogh commented 1 year ago

@Casi11as Thanks for going in and checking the projection weights of the adapter. I think that is helpful in validating that the adapter weights are indeed not being trained. I am facing the same issue. Were you able to get to the bottom of this yet?

msamogh commented 1 year ago

EDIT: This seems to fix it for me - https://github.com/tloen/alpaca-lora/issues/326#issuecomment-1507946790

Casi11as commented 1 year ago

@Casi11as Thanks for going in and checking the projection weights of the adapter. I think that is helpful in validating that the adapter weights are indeed not being trained. I am facing the same issue. Were you able to get to the bottom of this yet?

no, I don't solve it. I tried using other base models (starcoder), but it is the same.

tloen / alpaca-lora

run generate.py always generate same results #232