ChenYi99 / EgoPlan

BSD 3-Clause "New" or "Revised" License

Question about Image Loading Function for Ego4D #2

Closed by ShogoAkiyama 5 months ago

ShogoAkiyama commented 5 months ago

@ChenYi99

Hello,

I've encountered an issue when loading images for Ego4D where it seems the images are not being converted to RGB format. Specifically, I'm referring to this line in your code: https://github.com/ChenYi99/EgoPlan/blob/2a32be67434f920dc0b241dfa5280b2840445e7e/src/egoplan_video_llama_interface.py#L63

I managed to resolve the issue by adding the following line at line 62:

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

After applying this modification and testing your fine-tuned Video-LLaMA on EgoPlan-Bench Ego4D (out-domain), I saw accuracy improve from 41% to 43%. Could you take a moment to review this modification?
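For reference, here is a minimal sketch of what that conversion does to a 3-channel image. OpenCV's decoding functions such as `cv2.imread` return channels in BGR order, so each pixel has to be reversed before it reaches a model that expects RGB. The helper name `bgr_to_rgb` and the nested-list image are hypothetical and only for illustration; the real code works on NumPy arrays through `cv2.cvtColor`.

```python
def bgr_to_rgb(image):
    """Return a copy of `image` with every pixel reversed from (B, G, R) to (R, G, B).

    `image` is a hypothetical nested list: rows of 3-tuples of 0-255 integers.
    """
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 image in BGR order: a pure red pixel, then a pure blue pixel.
bgr_image = [[(0, 0, 255), (255, 0, 0)]]

rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)  # [[(255, 0, 0), (0, 0, 255)]]
```

Without the swap, red and blue are exchanged in everything the model sees, which may be why the change affects accuracy.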

Thank you for your time and assistance.

ChenYi99 commented 5 months ago

Thank you for being so careful. I already corrected the code three months ago, see Line 51.

ShogoAkiyama commented 5 months ago

@ChenYi99

Thank you for your response.

On review, I see that I had overlooked the existing RGB conversion for the video.

However, the current image is also used as input, separately from the video, so I think a similar RGB conversion may be needed before line 63 as well. I tried this in my environment, and it also improved accuracy on Ego4D (out-domain) with the model you shared. https://github.com/ChenYi99/EgoPlan/blob/2a32be67434f920dc0b241dfa5280b2840445e7e/src/egoplan_video_llama_interface.py#L63

Could you please try it out when you have time?

Thank you very much.

ChenYi99 commented 5 months ago

Thanks for your feedback. I have updated the code according to your comments.