ShareGPT4Omni / ShareGPT4Video

An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
https://sharegpt4video.github.io/

detail of slide_captioner_lmdeploy.py #24

Open meteorlium opened 2 months ago

meteorlium commented 2 months ago

In this file, when the sliding-window description is generated, only the current image and the previous caption are passed in. The query refers to two video frames, so why is the previous image not passed in as well? Is there a reason for this?

    def get_prepared_data(self,):
        curr_img = load_image(self.img_path_list[self.frame_ptr])
        if self.frame_ptr == 0:
            query = 'This is the first frame of a video, describe it in detail.'
        else:
            query = "Here are the Video frame {} at {}.00 Second(s) and Video frame {} at {}.00 Second(s) of a video, describe what happend between them. What happend before is: {}".format(
                self.frame_ptr, int(self.frame_ptr * 2),
                self.frame_ptr + 1, int((self.frame_ptr + 1) * 2),
                self.caption_list[-1])
        self.frame_ptr += 1
        return (query, curr_img)
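For illustration, here is a minimal sketch of what passing both frames could look like. It is not the repository's code: the class name `TwoFrameWindow`, the injected `load_image` callable, and the choice to return a list of images are all assumptions made for this example, and whether the inference backend accepts a list of images alongside one query is also assumed rather than confirmed here.

```python
class TwoFrameWindow:
    """Hypothetical variant of the sliding-window state that returns
    both the previous and the current frame (not the repository's code)."""

    def __init__(self, img_path_list, load_image):
        self.img_path_list = img_path_list
        # load_image is injected so the sketch runs without any image library.
        self.load_image = load_image
        self.frame_ptr = 0
        # The caller is expected to append each generated caption here.
        self.caption_list = []

    def get_prepared_data(self):
        curr_img = self.load_image(self.img_path_list[self.frame_ptr])
        if self.frame_ptr == 0:
            query = "This is the first frame of a video, describe it in detail."
            images = [curr_img]
        else:
            # Assumption: also load the previous frame so the model sees both.
            prev_img = self.load_image(self.img_path_list[self.frame_ptr - 1])
            query = (
                "Here are the Video frame {} at {}.00 Second(s) and "
                "Video frame {} at {}.00 Second(s) of a video, describe what "
                "happened between them. What happened before is: {}"
            ).format(
                self.frame_ptr, int(self.frame_ptr * 2),
                self.frame_ptr + 1, int((self.frame_ptr + 1) * 2),
                self.caption_list[-1],
            )
            images = [prev_img, curr_img]
        self.frame_ptr += 1
        return query, images


# Usage with a stand-in loader that just tags the path.
window = TwoFrameWindow(["a.jpg", "b.jpg"], load_image=lambda p: "img:" + p)
first_query, first_images = window.get_prepared_data()
window.caption_list.append("A cat sits on a sofa.")
second_query, second_images = window.get_prepared_data()
```

After the second call, `second_images` holds both loaded frames in order, while the quoted function above returns only `curr_img`.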

aiiph4 commented 2 months ago

+1

zehuichen123 commented 2 months ago

see #22