I use a Qwen2 model for both the actor and the reward model, but I get the following exception at
action_log_probs = self.actor(sequences, num_actions, attention_mask)
in
experience = self.experience_maker.make_experience(rand_prompts, **self.generate_kwargs)
ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to call tokenizer.padding_side = 'left' before tokenizing the input.
Why is this exception raised, and how can I solve it?
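For context, here is a minimal plain-Python sketch of what left versus right padding means for a batch and its attention mask. It is not the Qwen2 or tokenizer implementation: `pad_batch` and `has_right_padding` are hypothetical helpers, and the idea that the check looks for padding in the last column of the attention mask is an assumption based on the wording of the error message.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    sequences: Sequence[Sequence[int]],
    pad_id: int,
    padding_side: str = "left",
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token-id sequences to a common length (hypothetical helper).

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids: List[List[int]] = []
    attention_mask: List[List[int]] = []
    for seq in sequences:
        n_pad = max_len - len(seq)
        if padding_side == "left":
            # Padding goes in front, so real tokens end at the last position.
            input_ids.append([pad_id] * n_pad + list(seq))
            attention_mask.append([0] * n_pad + [1] * len(seq))
        elif padding_side == "right":
            # Padding goes at the end, so shorter rows end in pad tokens.
            input_ids.append(list(seq) + [pad_id] * n_pad)
            attention_mask.append([1] * len(seq) + [0] * n_pad)
        else:
            raise ValueError(f"unknown padding_side: {padding_side!r}")
    return input_ids, attention_mask


def has_right_padding(attention_mask: Sequence[Sequence[int]]) -> bool:
    """Assumed form of the check: any row whose last position is padding."""
    return any(row[-1] == 0 for row in attention_mask)


if __name__ == "__main__":
    batch = [[11, 12, 13], [21]]
    for side in ("left", "right"):
        ids, mask = pad_batch(batch, pad_id=0, padding_side=side)
        print(side, ids, mask, has_right_padding(mask))
```

If that assumption holds, any attention mask with zeros in its last column would trigger the error, regardless of how the prompts themselves were padded.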