Closed MattUnderscoreZhang closed 8 months ago
Can you rebase, please?
Ok, done
To describe at a high level what happened here: I took the overlapping logic from subset_worker and download_worker and moved it to a common file, worker.py. I tried to fully combine the two workers into a single class, but that isn't possible right now because they load input data differently.
The main reason I did this is that both workers use the video processing subsamplers, which I'm trying to rewrite so they don't read and write temp files. With the common video processing logic in a single file, I have fewer changes to track later.
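For anyone skimming, here is a minimal sketch of the shape of this refactor. It is not the actual code in worker.py: `process_sample`, `DownloadWorkerSketch`, `SubsetWorkerSketch`, and the `fetch`/`read` callables are hypothetical names made up for illustration, and a subsampler is assumed to be a plain function from a dict of streams to a dict of streams.

```python
from typing import Callable, Dict, List

# Assumption: a subsampler maps a dict of named byte streams to a new dict.
Streams = Dict[str, bytes]
Subsampler = Callable[[Streams], Streams]


def process_sample(streams: Streams, subsamplers: List[Subsampler]) -> Streams:
    """Shared processing step: apply each subsampler in order."""
    for subsampler in subsamplers:
        streams = subsampler(streams)
    return streams


class DownloadWorkerSketch:
    """Hypothetical worker that loads input by fetching a URL."""

    def __init__(self, subsamplers: List[Subsampler], fetch: Callable[[str], bytes]):
        self.subsamplers = subsamplers
        self.fetch = fetch

    def __call__(self, url: str) -> Streams:
        streams = {"video": self.fetch(url)}  # worker-specific loading
        return process_sample(streams, self.subsamplers)  # shared logic


class SubsetWorkerSketch:
    """Hypothetical worker that loads input from an existing dataset."""

    def __init__(self, subsamplers: List[Subsampler], read: Callable[[str], bytes]):
        self.subsamplers = subsamplers
        self.read = read

    def __call__(self, key: str) -> Streams:
        streams = {"video": self.read(key)}  # worker-specific loading
        return process_sample(streams, self.subsamplers)  # shared logic
```

Only the input loading differs between the two classes; everything after it goes through the one shared function, so a later change to the subsamplers touches a single call site.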
I have tested both subset_worker and download_worker on a subset of webvid, using my usual workflow: frame rate resampling, resizing, cropping, padding, clip finding, and cutting. I ran this both as a single download+process step and as a download followed by a separate processing step. Both runs reproduced the previous results, which suggests this refactor has no functional effect.
Looks pretty good to me, let's merge.
Unifies SubsetWorker and DownloadWorker logic. The two classes can probably be combined at this point.
Merge after #287.