faberf closed this issue 6 months ago.
Only option 1 sounds reasonable to me. Generally, having multiple enumerators does not make a lot of sense. Having multiple decoders only makes sense if you have a mixed collection with multiple media types. Decoding the same document multiple times will always be less efficient than decoding it once, so that is what should be done whenever possible.
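To make the "decode once" argument concrete, here is a minimal sketch in plain Python. It is not vitrivr-engine code: `decode_once` and `segment` are hypothetical stand-ins, and the 2-second and 5-second windows are arbitrary. It shows a single decoding pass feeding two segmentations of different granularity.

```python
def decode_once(duration_s, step_s=1.0):
    """Hypothetical decoder: yields one timestamp per decoded chunk."""
    t = 0.0
    while t < duration_s:
        yield t
        t += step_s


def segment(timestamps, window_s):
    """Group timestamps into consecutive windows of `window_s` seconds.

    Returns one (first_timestamp, last_timestamp) pair per window.
    """
    windows = {}
    for t in timestamps:
        windows.setdefault(int(t // window_s), []).append(t)
    return [(ts[0], ts[-1]) for _, ts in sorted(windows.items())]


# Decode a single time, then derive both granularities from the same output.
decoded = list(decode_once(duration_s=10))
clip_segments = segment(decoded, window_s=2)  # fine-grained, e.g. clip features
asr_segments = segment(decoded, window_s=5)   # coarse-grained, e.g. ASR features

print(clip_segments)
print(asr_segments)
```

The sketch buffers the decoded output in a list for simplicity. A streaming pipeline would instead have to fan the decoded elements out to both segmenters as they arrive.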
Generally, I agree. @ppanopticon mentioned today that points 2 and 3 are probably supported and would make a good interim solution; this issue was also intended as a reply to that. Nevertheless, silently failing and not persisting anything is strange behaviour.
Update: it turns out that with video-multiple-decoders.json I was incorrectly using COMBINE when I should have used MERGE. Switching to MERGE resolves this issue.
For the use case of extracting ASR features at a more coarse-grained level than clip features, it is useful to be able to define a pipeline where either:
My impression is that option 1 may be more useful down the line, whereas options 2 and 3 should be quite easy to implement but are currently not fully supported (or at least I haven't figured out how to implement them).
video-multiple-decoders.json: This pipeline decodes videos twice and properly creates temporal metadata descriptors and segments of different granularity; however, it does not persist anything. When I remove "long-decoder-stage" from the input of "time-stage", the pipeline uses only a single decoder and persists everything properly.
Digging into this: at line 60 of IngestionPipelineBuilder (commit ce53093f76eb5055f997b3e64ec11ce24161547e), the enumerator is not checked for multiple outputs (as other operators are at line 111) and is therefore never wrapped in a broadcast operator when that would be necessary. A simple fix (checking and wrapping) does not work, because the decoder expects an Enumerator as input and a BroadcastOperator is not an Enumerator.
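The underlying fan-out problem can be illustrated outside of vitrivr-engine. The sketch below is plain Python, not the engine's Kotlin API, and `enumerate_files` and `decode` are hypothetical stand-ins. It shows why a single-pass source that feeds two consumers needs some form of broadcast: without it, the first consumer drains the source and the second receives nothing.

```python
import itertools


def enumerate_files():
    """Hypothetical single-pass enumerator: yields each source path once."""
    yield from ["a.mp4", "b.mp4"]


def decode(name, source):
    """Hypothetical decoder: tags every path it receives with its own name."""
    return [f"{name}:{path}" for path in source]


# Without broadcast: both decoders share one iterator,
# so the first call exhausts it and the second gets nothing.
shared = enumerate_files()
unbroadcast_short = decode("short", shared)
unbroadcast_long = decode("long", shared)

# With broadcast: itertools.tee gives each decoder its own view of the source.
short_view, long_view = itertools.tee(enumerate_files(), 2)
broadcast_short = decode("short", short_view)
broadcast_long = decode("long", long_view)

print(unbroadcast_long)
print(broadcast_long)
```

In this sketch the views returned by `itertools.tee` are themselves iterators, so `decode` accepts them unchanged. The analogous requirement in the engine would presumably be a broadcast wrapper that still satisfies the type the decoder expects, which is exactly what the simple fix lacks.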
Point 3, using multiple enumerators, also doesn't seem to work. video-multiple-enumerators.json: this pipeline gives:
Dangling operators are not supported
As a minor side note: it would make sense for file metadata extraction to work directly after enumeration, but currently a decoding stage seems to be necessary. This is probably not a big issue in practice, though.