salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

Got 10 identical words in the response #143

Open zss977-web opened 1 year ago

zss977-web commented 1 year ago

Hi, I read your work and it helped me a lot. I am fine-tuning the BLIP decoder on my own task, visual dialog (VisDial). When I use the model to generate, I get max_length copies of the same word in every response.


          ans_in = batch["ans_in"]
          question_states = enc_out.unsqueeze(1).repeat(1,ans_in.size(-1),1)  # (batch_size, sequence_length, hidden_size)`
          question_atts = torch.ones(question_states.size()[:-1], dtype=torch.long).to(question_states.device)
          model_kwargs = {"encoder_hidden_states": question_states, "encoder_attention_mask": question_atts}

          bos_ids = torch.full((enc_out.size(0), 1), fill_value=1, device=enc_out.device)

          outputs = self.text_decoder.generate(input_ids=bos_ids,
                                               max_length=10,
                                               min_length=1,
                                               num_beams=num_beams,
                                               # eos_token_id=self.tokenizer.sep_token_id,
                                               # pad_token_id=self.tokenizer.pad_token_id,
                                               eos_token_id=2,
                                               pad_token_id=0,
                                               **model_kwargs)

Some results:

          ['', '7', '7', '7', '7', '7', '7', '7', '7', '7']
          ['', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red']
          ['', 'about', 'about', 'about', 'about', 'about', 'about', 'about', 'about', 'about']

Any guidance or suggestions would be very helpful. Thanks.
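For reference, here is a standalone sanity check of the special-token ids, modelled on the init_tokenizer helper in BLIP's blip.py. It assumes the decoder was built from BLIP's default bert-base-uncased tokenizer, and is only meant to show which ids generate() should receive in place of the hard-coded 1 / 2 / 0 above:

          from transformers import BertTokenizer

          # Rebuild the tokenizer the way BLIP's init_tokenizer() in blip.py does:
          # bert-base-uncased plus a [DEC] token that the decoder uses as BOS.
          tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
          tokenizer.add_special_tokens({"bos_token": "[DEC]"})
          tokenizer.add_special_tokens({"additional_special_tokens": ["[ENC]"]})

          # These are the ids to pass to generate() instead of hard-coded constants.
          print("bos [DEC]:", tokenizer.bos_token_id)   # appended at the end of the vocab
          print("eos [SEP]:", tokenizer.sep_token_id)   # 102 for bert-base-uncased
          print("pad [PAD]:", tokenizer.pad_token_id)   # 0 for bert-base-uncased

If the eos_token_id passed to generate() never matches the decoder's actual [SEP] id, decoding cannot stop early, which would at least explain why every answer runs to exactly max_length tokens.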

other-ones commented 1 year ago

Hi, have you solved this issue? I'm facing the same problem.