Open DvHuang opened 1 year ago
Make sure transformer return past_key_values,The code will judge whether it is a past process based on past_key_values, Thus affecting the shape of position_ids and input_ids.When use_cache is true, transformer will return past_key_values.
Make sure transformer return past_key_values,The code will judge whether it is a past process based on past_key_values, Thus affecting the shape of position_ids and input_ids.When use_cache is true, transformer will return past_key_values.