We could pair meta data describing frames with all the bounding boxes detected. Since video content may contain a lot of frames, we could assign start PTS to each structure. The structure would be repeated until the next "IFrame".
you could have something like
F1 ---- F1' ----F1''----F2-----F3---F3---
Where the Fi` would be a subset of the Fi frame representing for instance a widget that has moved or has changed it state.
Create a GStreamer Simulator.
We could pair meta data describing frames with all the bounding boxes detected. Since video content may contain a lot of frames, we could assign start PTS to each structure. The structure would be repeated until the next "IFrame".
you could have something like
Where the Fi` would be a subset of the Fi frame representing for instance a widget that has moved or has changed it state.