RifleZhang / LLaVA-Hound-DPO

video frame rate #3

Closed rese1f closed 7 months ago

rese1f commented 7 months ago

Thanks for your great data contribution! I'm wondering what the frame rate of your extracted videos is. Thanks!

guilk commented 7 months ago

Hi, we use ffmpeg -loglevel quiet -i VIDEO_PATH -vf "scale=336:-1,fps=2" FRAME_ROOT for video decoding. Then we use np.linspace(0, duration-1, num_segments, dtype=int) to uniformly sample frames.
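
For readers who want to reproduce the sampling step, here is a minimal sketch of the np.linspace call quoted above. It assumes that duration means the number of frames ffmpeg wrote for the video (the thread does not define it), and the names sample_frame_indices and index_to_seconds are hypothetical helpers for illustration, not functions from the repository. The timestamp conversion only relies on the fps=2 setting in the ffmpeg command and is approximate.

```python
import numpy as np


def sample_frame_indices(duration: int, num_segments: int) -> np.ndarray:
    """Pick num_segments evenly spaced frame indices from 0 to duration - 1.

    Assumption: `duration` is the number of decoded frames, not seconds.
    With dtype=int, the fractional positions are truncated toward zero.
    """
    return np.linspace(0, duration - 1, num_segments, dtype=int)


def index_to_seconds(frame_index: int, fps: float = 2.0) -> float:
    """Approximate timestamp of a decoded frame, given the fps=2 decode rate."""
    return frame_index / fps


# Example: a 10-second clip decoded at fps=2 gives roughly 20 frames.
indices = sample_frame_indices(duration=20, num_segments=8)
print(indices.tolist())
print([index_to_seconds(i) for i in indices.tolist()])
```

Because the first and last indices are always 0 and duration - 1, the sampled frames span the whole clip regardless of its length; only the spacing between them changes.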

rese1f commented 7 months ago

Sounds good! So the image sequences in each folder are at fps=2, right @guilk? Also, I found that unzipping each tar.gz file is extremely slow. Is that normal?

rese1f commented 7 months ago

I also noticed that there is no chunk 22 in 600k 👁️

rese1f commented 7 months ago

And chunk 29 is also missing.

RifleZhang commented 7 months ago

> sounds good! so the image seq in each folder are under fps=2 right @guilk and i found extremely slow when unzip each tar.gz file, is that normal?

In the inference example in the README, the video-to-frame function we used is defined in inference.inference_utils.decode2frame.

We just uploaded the missing chunks and fixed a typo in setup_train_data.sh. Let me know if you run into any problems. As for the speed, unzipping everything probably takes about 5-10 minutes.

rese1f commented 7 months ago

Thanks for the fix (your typo made my /home messy lol)