When I used "LLaVA-RLHF-13b-v1.5-336/sft_model" instead of "llava-vicuna-v1-5-13b-336-finetune-final-padding" to initialize the reward model, I got the following error:
File "/home/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2678, in sample
respond_outputs = unwrapped_policy.respond(
File "LLaVA-RLHF-main/RLHF/models/rl_models.py", line 339, in respond
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Is this error caused by a mistake in how I initialized the reward model? The training code and datasets are the same as in your original repo.
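For reference, here is a minimal sketch (not from the repo; the tensor shapes and variable names are my own) showing how the same RuntimeError class of problem arises when the policy's logits contain NaN, and the kind of diagnostic one could place just before the multinomial call in rl_models.py:

```python
import torch

# Hypothetical policy logits; a single NaN is enough to corrupt the whole softmax row.
logits = torch.randn(2, 32000)
logits[0, 0] = float("nan")

probs = torch.softmax(logits, dim=-1)

# Diagnostic check one could add before the sampling call in rl_models.py.
bad_rows = ~torch.isfinite(probs).all(dim=-1) | (probs < 0).any(dim=-1)
if bad_rows.any():
    print(f"{bad_rows.sum().item()} row(s) contain NaN/inf or negative probabilities")
else:
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
```

If a check like this fires on the very first rollout, the corruption is most likely coming from the loaded weights (e.g. a mismatched or wrongly initialized checkpoint) rather than from the RL updates themselves.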
Hi @Ritz111, did you install the LLaVA model as described in https://github.com/llava-rlhf/LLaVA-RLHF/tree/main/llava_setup?
Yes, I did; otherwise I couldn't run the repo at all.
Hi @Edward-Sun, is there any update? Is "llava-vicuna-v1-5-13b-336-finetune-final-padding" the same model as "LLaVA-RLHF-13b-v1.5-336/sft_model"?
Yeah "llava-vicuna-v1-5-13b-336-finetune-final-padding" is the same as "LLaVA-RLHF-13b-v1.5-336/sft_model" 👍
Hi, when I loaded the pretrained reward model, I found that the base model "llava-vicuna-v1-5-13b-336-finetune-final-padding" cannot be found. Has it been uploaded to https://huggingface.co/? Could you provide a download link? Thanks.
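In case it helps others hitting the same issue, below is a sketch of how the released SFT weights could be fetched from the Hub once the location is known. The repo id is a placeholder, not a confirmed path; substitute the one the maintainers point to.

```python
from huggingface_hub import snapshot_download

# Placeholder repo id -- replace with the actual LLaVA-RLHF checkpoint location
# once the maintainers confirm where it is hosted on the Hugging Face Hub.
REPO_ID = "<org>/LLaVA-RLHF-13b-v1.5-336"

# Download only the SFT weights used to initialize the reward model.
local_dir = snapshot_download(repo_id=REPO_ID, allow_patterns=["sft_model/*"])
print("checkpoint downloaded to", local_dir)
```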