In this file, when the sliding-window description is generated, only the current image and the previous caption are passed in. Is the previous image not passed in as well? Is there a reason for this?
    def get_prepared_data(self,):
        curr_img = load_image(self.img_path_list[self.frame_ptr])
        if self.frame_ptr == 0:
            query = 'This is the first frame of a video, describe it in detail.'
        else:
            query = "Here are the Video frame {} at {}.00 Second(s) and Video frame {} at {}.00 Second(s) of a video, describe what happend between them. What happend before is: {}".format(
                self.frame_ptr, int(self.frame_ptr * 2),
                self.frame_ptr + 1, int((self.frame_ptr + 1) * 2),
                self.caption_list[-1])
        self.frame_ptr += 1
        return (query, curr_img)
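To make the question concrete, here is a rough sketch of the variant I had in mind, where the previous frame is returned together with the current one. This is not the repository's code. The class name `SlidingWindowPrompter`, the injected `load_image` argument, and the `seconds_per_frame` parameter are all hypothetical; only the attribute names and the frame numbering follow the snippet above. Whether the model accepts two images per query is also an assumption.

```python
class SlidingWindowPrompter:
    """Hypothetical stand-in for the class that owns get_prepared_data."""

    def __init__(self, img_path_list, load_image, seconds_per_frame=2):
        self.img_path_list = img_path_list
        # Injected loader (assumption) so the sketch runs without a real image library.
        self.load_image = load_image
        self.seconds_per_frame = seconds_per_frame
        self.caption_list = []  # the caller appends one caption per processed frame
        self.frame_ptr = 0

    def get_prepared_data(self):
        curr_img = self.load_image(self.img_path_list[self.frame_ptr])
        if self.frame_ptr == 0:
            query = "This is the first frame of a video, describe it in detail."
            images = [curr_img]
        else:
            # Difference from the original: also load the previous frame,
            # so both frames named in the prompt are actually supplied.
            prev_img = self.load_image(self.img_path_list[self.frame_ptr - 1])
            query = (
                "Here are the Video frame {} at {}.00 Second(s) and "
                "Video frame {} at {}.00 Second(s) of a video, "
                "describe what happened between them. "
                "What happened before is: {}"
            ).format(
                self.frame_ptr,
                int(self.frame_ptr * self.seconds_per_frame),
                self.frame_ptr + 1,
                int((self.frame_ptr + 1) * self.seconds_per_frame),
                self.caption_list[-1],
            )
            images = [prev_img, curr_img]
        self.frame_ptr += 1
        return query, images
```

With this variant the return value is `(query, images)`, where `images` holds one frame for the first call and two frames afterwards, so the caller would need to handle a list instead of a single image.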