iejMac / video2dataset

Easily create large video dataset from video urls
MIT License
546 stars 65 forks source link

Clipping subsampler refactor #275

Closed MattUnderscoreZhang closed 9 months ago

MattUnderscoreZhang commented 10 months ago

I need to add ffmpeg pipes to speed up processing, and to do that I first needed to refactor some of the code to make it easier to reason about.

For context, I'm processing webvid-10M and the codebase is currently too slow. I'm hoping to speed it up by an order of magnitude by combining subsamplers into a single ffmpeg pipe operation, and adding other optimizations.

iejMac commented 10 months ago

nice, yes this is a great idea

iejMac commented 10 months ago

is this ready to go? If so could you give a high level overview of the changes?

MattUnderscoreZhang commented 10 months ago

This is basically just a refactor. I broke the code into smaller functions, added types, only process metadata once per clip (instead of once per stream), renamed some stuff for clarity, and cleaned up the segment_times collection code.

I avoided making any functional changes, but I think I see a couple of places for improvement that I'll hit in later pull requests. The ffmpeg pipe stuff is also going to come later, but I may need to refactor the other subsamplers first.

rom1504 commented 10 months ago

looks much better

I think the smaller functions could be an opportunity to add some more tests. What do you think?

rom1504 commented 10 months ago

just merged https://github.com/iejMac/video2dataset/pull/262 and having a bit of trouble to rebase here hmm

rom1504 commented 10 months ago

probably best if you handle this rebase @MattUnderscoreZhang ; I think it's mostly about replacing the code that's doing the extract subtitle by that new function

MattUnderscoreZhang commented 10 months ago

Yeah no prob, I just merged changes.

rom1504 commented 10 months ago

@MattUnderscoreZhang did you test this to produce the same results as before? this is a lot of complex code that is changed

MattUnderscoreZhang commented 9 months ago

Yes, I get the exact same results as before for my use case, which involves a frame rate resampling, resizing, cropping, and clipping.

Plus, all the unit tests pass as expected.