cvlab-columbia / viper

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

AttributeError: 'VideoSegment' object has no attribute 'shape' when running GPT-3.5-turbo generated code #25

Closed knightyxp closed 6 months ago

knightyxp commented 1 year ago

Hi, I'm currently trying to implement video question-answering using the code generated by GPT-3.5-turbo, as described in the paper. However, when executing the generated code, I'm encountering an AttributeError: 'VideoSegment' object has no attribute 'shape'. The relevant portion of the code is as follows:

video_segment = VideoSegment(video)
last_frame = ImagePatch(video_segment, -1)

According to the paper, ImagePatch(video_segment, -1) should be a valid operation that retrieves the last frame from the video segment to create an ImagePatch. However, it seems like this operation is not actually implemented in the code. Could you please provide guidance on how this is supposed to work? Is there a step in the process that is not explicitly stated in the paper, such as extracting the last frame from the video_segment before passing it to ImagePatch? Any clarification would be greatly appreciated.
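For anyone reproducing this, here is a minimal sketch of the failure and of the workaround suggested above, namely extracting the frame first and only then building the patch. It does not use the repository's classes: `ImagePatch`, `VideoSegment`, and `Frame` below are simplified, hypothetical stand-ins, and the `frame_from_index` method is an assumption about what the real `VideoSegment` API may offer, so check it against the actual code before relying on it.

```python
class Frame:
    """Hypothetical stand-in for one frame tensor of shape (C, H, W)."""

    def __init__(self, index, shape=(3, 4, 4)):
        self.index = index
        self.shape = shape


class ImagePatch:
    """Simplified stand-in: expects an image-like object exposing `.shape`."""

    def __init__(self, image, left=None):
        # Any object without `.shape` (such as a VideoSegment) fails here.
        self.channels, self.height, self.width = image.shape
        self.image = image


class VideoSegment:
    """Simplified stand-in that wraps a sequence of frames."""

    def __init__(self, video):
        self.frames = list(video)
        self.num_frames = len(self.frames)

    def frame_from_index(self, index):
        # Assumed helper: wrap a single frame in an ImagePatch.
        return ImagePatch(self.frames[index])


video = [Frame(i) for i in range(5)]
video_segment = VideoSegment(video)

# Reproduce the error: the whole segment is passed where a frame is expected.
try:
    ImagePatch(video_segment, -1)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Workaround: pull the last frame out of the segment, then build the patch.
last_frame = video_segment.frame_from_index(-1)
```

In this sketch the second positional argument `-1` is never interpreted as a frame index, which is why the segment itself reaches the `.shape` lookup. Whether the real `ImagePatch` constructor treats that argument the same way is something to confirm in the repository.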

surisdi commented 1 year ago

Hi, could you share the full traceback, or at least the specific line where the error is raised? Also, could you tell me the dimensions of the variable video? Thanks!

surisdi commented 6 months ago

Marking the stale issue as resolved. Feel free to reopen if there are any updates.