OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

Qwen2 ppo #333

Open Yusifu opened 6 days ago

Yusifu commented 6 days ago

I use a qwen2 model for both the actor and the reward model, but I get the following exception at `action_log_probs = self.actor(sequences, num_actions, attention_mask)` inside `experience = self.experience_maker.make_experience(rand_prompts, **self.generate_kwargs)`:

ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to call `tokenizer.padding_side = 'left'` before tokenizing the input.

Why is this exception raised, and how can I solve it?
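For context, the check exists because a decoder-only model generates from the last position of each sequence; with right padding, a shorter sequence ends in pad tokens, so generation would continue from padding instead of from the prompt. A minimal plain-Python sketch of the difference (the pad id 0 and the helper below are illustrative, not OpenRLHF code):

```python
# Illustration: why decoder-only batched generation wants LEFT padding.
# pad_id = 0 is a placeholder; real tokenizers define their own pad token.
pad_id = 0

def pad_batch(seqs, side="left"):
    """Pad variable-length token-id lists to a rectangle on one side."""
    width = max(len(s) for s in seqs)
    out = []
    for s in seqs:
        pads = [pad_id] * (width - len(s))
        out.append(pads + s if side == "left" else s + pads)
    return out

batch = [[5, 6], [7, 8, 9]]

# Left padding: the last column holds a real token for every row,
# so generation can continue from position -1 in each sequence.
print([row[-1] for row in pad_batch(batch, "left")])   # [6, 9]

# Right padding: the short row now ends in a pad token, which is what
# the Qwen2 flash-attention check is warning about.
print([row[-1] for row in pad_batch(batch, "right")])  # [0, 9]
```

Following the error message, the direct fix is `tokenizer.padding_side = "left"` before tokenizing the prompts.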

hijkzzz commented 6 days ago

Just disable flash_attn.
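A sketch of what disabling it means at the `transformers` level, assuming the models are loaded with `from_pretrained` (`attn_implementation` is the standard Hugging Face argument; the exact OpenRLHF flag that maps onto it may differ by version, and the helper below is hypothetical):

```python
# Sketch: pick the attention backend when loading the actor/reward models.
# The flash_attention_2 backend is what enforces the left-padding check
# that raised the ValueError above; "eager" avoids it.

def model_load_kwargs(use_flash_attn: bool) -> dict:
    """Build illustrative `from_pretrained` kwargs (hypothetical helper)."""
    return {
        "attn_implementation": "flash_attention_2" if use_flash_attn else "eager",
        "torch_dtype": "auto",
    }

# With flash attention disabled, Qwen2 no longer requires left padding
# for batched generation, e.g. (not executed here):
# model = AutoModelForCausalLM.from_pretrained(model_name, **model_load_kwargs(False))
print(model_load_kwargs(False)["attn_implementation"])  # eager
```

Alternatively, keeping flash attention enabled and setting `tokenizer.padding_side = "left"` as the error message suggests should also resolve it.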