in03 opened 3 years ago
A lot of water under the bridge since this issue was opened!
Adding split and stitch encoding (a.k.a. chunking) is going to be a huge refactor of the way job objects are handled. The job objects should be refactored anyway, so adding chunking is a good reason to do it.
My list above is a little outdated now:
- [ ] A new task to parse the original job into segments
This doesn't need to be a task. We can chunk quickly and easily in the queuer, and we can even chunk only jobs that are over a certain duration. Since Celery's primitives allow for some pretty flexible nesting, we can check the duration of the source media and, if it's over the threshold, split the job into a group of tasks with a callback (i.e. a "chord"); then we can wrap unchunked plain-Jane tasks and chunked-job chords alike in a single group (see the sketch below).
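For illustration, a minimal sketch of that queuer-side dispatch. `probe_duration`, `split_into_segments`, the task names, and the threshold value are all hypothetical placeholders, not existing code:

```python
# A minimal sketch of queuer-side chunking; helper names are hypothetical.
from celery import chord, group, signature

CHUNK_THRESHOLD = 60.0  # seconds; example value only

def build_job_signature(job):
    """Return a plain task for short jobs, or a chord for long ones."""
    duration = probe_duration(job.source_path)  # hypothetical helper
    if duration <= CHUNK_THRESHOLD:
        # Plain-Jane task: encode the whole file in one go.
        return signature("worker.encode", args=(job.source_path,))
    # Chunked job: one encode task per segment, joined by a callback.
    segments = split_into_segments(job.source_path, CHUNK_THRESHOLD)
    header = [signature("worker.encode_segment", args=(seg,)) for seg in segments]
    # Celery prepends the list of header results to the callback's args.
    callback = signature("worker.concat_segments", args=(job.output_path,))
    return chord(header, callback)

def dispatch(jobs):
    """Wrap unchunked tasks and chunked chords alike in one group."""
    return group(build_job_signature(j) for j in jobs).apply_async()
```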
- [ ] The encoding task's group will become a chord so we can join the segments after encode
Our chunked-job callbacks can be Celery tasks themselves, doing the necessary FFmpeg join and cleanup. We can add a custom Celery task state for concatenating and removing temporary files, so that stage shows up in queuer-side progress too. Each primitive has its own results, and those results are accessible asynchronously, even when deeply nested. This makes it possible to accommodate chunking in our queuer-side progress bar with little modification (see the sketch below).
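A minimal sketch of such a callback task; the task name and the custom `CONCATENATING` state are assumptions, not existing project code or Celery built-ins:

```python
# Sketch of a chord callback that joins segments and cleans up after itself.
import os
import subprocess
from celery import shared_task

@shared_task(bind=True)
def concat_segments(self, segment_paths, output_path):
    """Join encoded segments with FFmpeg's concat demuxer, then clean up."""
    # Custom state so the queuer-side progress bar can display this stage.
    self.update_state(state="CONCATENATING")
    list_path = output_path + ".segments.txt"
    with open(list_path, "w") as f:
        f.writelines(f"file '{seg}'\n" for seg in segment_paths)
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path,
         "-c", "copy", output_path],
        check=True,
    )
    # Remove the temporary segment files and the concat list.
    for path in segment_paths + [list_path]:
        os.remove(path)
    return output_path
```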
- [ ] Job must be pickled, not sent as JSON. We need to transport Resolve's PyRemoteObj MediaPoolItem as task results
PyRemoteObjs are in-memory references. There is no way for a media pool item to survive the round trip between queuer and worker, except by using an ID to reference the in-memory object on the queuer.
Maybe now, with Resolve 18, it would be possible to use the new unique_id attributes to retain a reference to a media pool item across machines, but even so, that would require iterating all databases, projects, timelines, timeline items and finally media pool items by their respective IDs. That may be just fine for a SQL database, but through the Python API it's slow, made worse by the need for Resolve to actually open each project and timeline to do it.
We're keeping it simple: link if the same project is open, and leave it to the user to reiterate the timelines manually the next time the project is open.
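For illustration, a minimal sketch of that simple approach, assuming the Resolve 18 scripting API (`GetUniqueId` et al.) and searching only the project that is currently open:

```python
# Sketch: look up a MediaPoolItem by unique_id in the open project only.
import DaVinciResolveScript as dvr

def find_media_pool_item(unique_id):
    """Depth-first search of the open project's media pool for a match."""
    project = dvr.scriptapp("Resolve").GetProjectManager().GetCurrentProject()
    if project is None:
        return None  # nothing open; leave it to the user to relink later
    stack = [project.GetMediaPool().GetRootFolder()]
    while stack:
        folder = stack.pop()
        for clip in folder.GetClipList():
            if clip.GetUniqueId() == unique_id:
                return clip
        stack.extend(folder.GetSubFolderList())
    return None
```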
/cib
Branch feature/issue-9--Split-and-Stitch-Encoding created!
I've been messing around with running separate FFmpeg processes on segments of the same video file. There are a bunch of benefits here:
- Chunked job structure lends itself to more reliable progress metrics
I've got this working reliably locally, though with no performance gains, obviously, since I'm running all the FFmpeg processes on the same machine. A rough version of the local experiment is sketched below.
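Roughly what that local test looks like; paths, codec, and segment length are example values, and splitting with `-c copy` cuts at keyframe boundaries:

```python
# Rough local split/encode/stitch test with example paths and settings.
import glob
import subprocess

SRC, SEG_LEN = "source.mov", 30  # example source, ~30 s segments

# 1. Split the source into segments without re-encoding.
subprocess.run(
    ["ffmpeg", "-i", SRC, "-c", "copy", "-f", "segment",
     "-segment_time", str(SEG_LEN), "-reset_timestamps", "1",
     "seg_%03d.mov"],
    check=True,
)

# 2. Encode each segment in its own FFmpeg process (all on one machine
#    here, so no speedup -- this just proves the pipeline works).
encoded = []
for seg in sorted(glob.glob("seg_*.mov")):
    out = seg.replace("seg_", "enc_")
    subprocess.run(["ffmpeg", "-i", seg, "-c:v", "libx264", out], check=True)
    encoded.append(out)

# 3. Stitch the encoded segments back together losslessly.
with open("concat.txt", "w") as f:
    f.writelines(f"file '{p}'\n" for p in encoded)
subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "concat.txt",
     "-c", "copy", "output.mov"],
    check=True,
)
```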
To get this working here we'll need a few things:
- A new task to parse the original job into segments
- The encoding task's group will become a chord so we can join the segments after encode
- Job must be pickled, not sent as JSON. We need to transport Resolve's PyRemoteObj MediaPoolItem as task results