Dropping in quick notes following a call w/@slifty.
In dense summary, an ingester (or "ingest engine" if we prefer):

- receives a source string (file path, network address, YouTube URL, etc.)
- identifies the source type
- probes the source for composition details
- chooses from the available A/V streams (if multiple)
- decides if we need to normalize/transcode them
- decorates the stream with metadata
- packages it for streaming
- reserves an output address
- notifies the countertop coordinator
- begins streaming
Let's assume we have an RTP stream like `rtp://192.168.1.100:5000`. To begin ingesting it, we run a command like `yarn ingest:add rtp://192.168.1.100:5000`.
🌊
If input ends (end of file, connection closed), finish streaming the current buffer, and then die.
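In Node terms, that teardown might look something like this minimal sketch (the input source and the `flushBuffer` helper are stand-ins, not actual project code):

```javascript
import fs from 'fs'

// Stand-ins for this sketch: any Readable input, plus a helper that finishes
// streaming whatever is still buffered.
const inputStream = fs.createReadStream('./example-source.ts')
const flushBuffer = async () => { /* finish streaming the current buffer */ }

// When the input ends (end of file) or the connection closes, finish the
// buffer, then die.
const shutDown = async () => {
  await flushBuffer()
  process.exit(0)
}

inputStream.on('end', shutDown)
inputStream.on('close', shutDown)
```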
Most of the above can be handled by the individual ingesters themselves. However, a few things may be nice to have at a higher / meta level.
Okay, that's all I really have right now for an Ingestion Coordinator.
This is a great exploration! Some thoughts / feedback:
I think there may be merit to designing the relationship between ingestion coordinator and countertop so that it is fully decoupled.
Put another way, ideally the IC wouldn't know the countertop exists to begin with / they wouldn't require a back and forth.
I see a key role of the ingestion coordinator as figuring out which "ingestion engine" to invoke for a given input stream.
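For example, that dispatch could look like this sketch, where each engine exposes a static `canIngest` (the `RtpIngestionEngine` and the `canIngest` interface are assumptions here, not settled API):

```javascript
// Sketch: each engine class declares which source strings it understands, so
// the coordinator can dispatch without knowing engine internals. These stubs
// are illustrative only.
class FileIngestionEngine {
  static canIngest(source) { return !source.includes('://') }
  constructor(source) { this.source = source }
}

class RtpIngestionEngine {
  static canIngest(source) { return source.startsWith('rtp://') }
  constructor(source) { this.source = source }
}

const engines = [RtpIngestionEngine, FileIngestionEngine]

const selectEngine = (source) => {
  const Engine = engines.find((candidate) => candidate.canIngest(source))
  if (!Engine) throw new Error(`No ingestion engine for source: ${source}`)
  return new Engine(source)
}

// e.g. selectEngine('rtp://192.168.1.100:5000') -> an RtpIngestionEngine
```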
I'm trying to think of a use case where a configured stream would be mounted but not consumed. It might be best for the IC to assume that if the stream exists and it has been told to ingest it, the consumers are there; it shouldn't worry about whether they stop.
Likewise, countertop should probably watch for new streams and assume that their existence implies the command to consume it.
Does that all make sense?
An open question is "what is the best way to convey to the countertop that a stream is ready to process" -- this might be best done via a Kafka queue (or pair of queues: one to indicate closed streams and one to indicate open ones)
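A sketch of that pair-of-queues idea using kafkajs (the broker address, topic names, and message shape are all assumptions):

```javascript
import { Kafka } from 'kafkajs'

const kafka = new Kafka({ clientId: 'ingestion-coordinator', brokers: ['localhost:9092'] })
const producer = kafka.producer()

// Publish a lifecycle event so the countertop can discover streams on its
// own, without the IC ever talking to the countertop directly.
const announceStream = async (streamTopic, event) => {
  await producer.connect()
  await producer.send({
    // Hypothetical topic names: one queue per lifecycle event.
    topic: event === 'open' ? 'streams.opened' : 'streams.closed',
    messages: [{ value: JSON.stringify({ streamTopic, at: Date.now() }) }],
  })
}

// e.g. await announceStream('ingest.rtp-192-168-1-100', 'open')
```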
A whole lot of the above just changed! Mostly, numbers 2–4.
The ingestion engines will still receive input (file, UDP stream, etc.), verify its composition, make re-encoding decisions, package the stream metadata, etc. But rather than make a network stream available for the countertop, we're exploring blindly streaming the data into Kafka for consumption by the countertop (either by the sous chef or, potentially, directly by the line cooks if it makes enough sense for the ingester to essentially be an appliance).
In some sandbox scripts, we were able to successfully stream a local video file in chunks, create Kafka messages with the chunk data (each ~70-100K, well under the 1MB default/recommended Kafka message limit) on a specific topic, and have a consumer that listened to that topic and reassembled the chunks into an identical file. (Actually, we are still working to ensure the chunks can be reassembled in order; we suspect the message producer is adding them to the queue out of order, probably because that's currently an async process.)
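For reference, a minimal sketch of the producer side of that experiment, with two ordering safeguards we suspect we'll need: a monotonic chunk index in each message, and sequential (awaited) sends so chunks can't race each other. The broker address and topic handling are assumptions:

```javascript
import fs from 'fs'
import { Kafka } from 'kafkajs'

const kafka = new Kafka({ clientId: 'ingestion-sandbox', brokers: ['localhost:9092'] })
const producer = kafka.producer()

const streamFileToKafka = async (filePath, topic) => {
  await producer.connect()
  // ~100K chunks keep each message comfortably under Kafka's 1MB default limit.
  const reader = fs.createReadStream(filePath, { highWaterMark: 100 * 1024 })
  let chunkIndex = 0
  for await (const chunk of reader) {
    // Awaiting each send (rather than firing them off asynchronously) is the
    // likely fix for the out-of-order reassembly noted above.
    await producer.send({
      topic,
      messages: [{
        key: filePath, // one key => one partition => Kafka preserves order
        value: chunk,
        headers: { chunkIndex: String(chunkIndex) },
      }],
    })
    chunkIndex += 1
  }
  await producer.disconnect()
}
```

On the consumer side, the `chunkIndex` header would let us detect (or re-sort) any chunks that still arrive out of order.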
Anything I missed / mischaracterized, @slifty?
This captured it I think!
We had another solid working session today which started diving into our ffmpeg integration. There are a series of open questions that I think are safe to call "optimization" level questions (e.g. the best wrapper format, do we actually wanna normalize the codec, etc), but we can punt on that a bit for the short term.
We now have a draft of the abstract ingestion engine as well as a FileIngestionEngine implementation.
The abstract engine handles setting up ffmpeg, piping data to and from it, and writing packets to Kafka.
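A rough shape for that pair of classes; the method names, constructor signature, and the mpegts wrapper are illustrative here, not the actual draft's API:

```javascript
import { spawn } from 'child_process'
import fs from 'fs'

class AbstractIngestionEngine {
  constructor(producer, topic) {
    this.producer = producer // an already-connected Kafka producer
    this.topic = topic
  }

  // Subclasses decide where the raw bytes come from.
  getInputStream() {
    throw new Error('getInputStream must be implemented by a subclass')
  }

  start() {
    // Pipe the input through ffmpeg; mpegts is a placeholder container given
    // the open wrapper-format questions above.
    const ffmpeg = spawn('ffmpeg', ['-i', 'pipe:0', '-f', 'mpegts', 'pipe:1'])
    this.getInputStream().pipe(ffmpeg.stdin)
    ffmpeg.stdout.on('data', (packet) => {
      // Fire-and-forget for brevity; ordering would need the same care as the
      // chunking experiment above.
      this.producer.send({ topic: this.topic, messages: [{ value: packet }] })
    })
  }
}

class FileIngestionEngine extends AbstractIngestionEngine {
  constructor(filePath, producer, topic) {
    super(producer, topic)
    this.filePath = filePath
  }

  getInputStream() {
    return fs.createReadStream(this.filePath)
  }
}
```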
The big question remaining is around the structure of the payload that is sent to Kafka, including how we handle the concept of time (do we calculate the duration of a payload? Do we track the total duration so far? etc.).
It may well be that the ingestion engines can either track time via data duration OR choose to function in "real time".
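To make the time question concrete, here is one possible payload shape (the field names are purely illustrative):

```javascript
// A "data duration" engine would fill in duration/position from the media
// itself, while a "real time" engine might rely on createdAt instead.
const examplePayload = {
  data: Buffer.alloc(0),               // the raw packet bytes for this chunk
  duration: 33,                        // ms of media covered by this payload
  position: 120350,                    // ms of media ingested before this payload
  createdAt: new Date().toISOString(), // wall-clock time, for real-time engines
}
```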
One of the big questions we're facing (somewhat related to #42) is how to keep track of the duration of the video we've ingested so far. I asked this StackOverflow question in the hope of gleaning some wisdom from the crowd.
Task

Description
The architecture outlined in #1 talks about the idea of "ingestion engines" which will be responsible for converting external media sources into a standardized video stream for internal use.
We anticipate a few types of inputs over time, so we should create an abstract class that defines the general shape of what an "Ingestion Engine" is responsible for.
Location of this is TBD, but maybe put it in a new directory like `src/ingestionEngines/`?

Relevant resources / research
Related Issues
#14 is an issue for the exploration of ffmpeg and stream formats that we might want to use