google / visionai

BSD 3-Clause "New" or "Revised" License

Error attempting to receive and save stream to MP4 #7

Closed jeff-wursta closed 1 year ago

jeff-wursta commented 1 year ago

We are currently sending a looping MP4 video up to a Vertex Vision AI stream and it is streaming properly. We deployed the Application and enabled the Enable output streaming feature so that we could ingest the processed stream and analyze the results.

However, when ingesting the stream using the following:

vaictl -p PROJECT_ID \
         -l us-central1 \
         -c application-cluster-0 \
         --service-endpoint visionai.googleapis.com \
receive streams video-file application-output-ed1a4bca --output .

we run into the following error:

2023-04-05 09:21:29 I20230405 13:21:29.166227    18 event_manager.cc:74] Generated event prefix: pitgt2bi
2023-04-05 09:21:29 I20230405 13:21:29.697472    19 streaming_service_capture.cc:170] Polling events
2023-04-05 09:21:34 I20230405 13:21:34.425112    19 streaming_service_capture.cc:197] Received new event: q7y9ha97
2023-04-05 09:21:42 W20230405 13:21:42.178751     8 ingester.cc:235] The capture did not stop within the given timeout and may still be working.
2023-04-05 09:21:42 E20230405 13:21:42.284181     8 ingester.cc:241] INVALID_ARGUMENT: Got a Packet with type class "protobuf" but expected "gst".; during an EventWrite write; while writing a packet into an event sink; while handling a packet filtered element
2023-04-05 09:21:42 E20230405 13:21:42.286350     8 ingester_app.cc:71] UNKNOWN: The dataflow encounterd an error. See logs for more details.; while stopping modules

It may be worth noting that we are running both vaictl commands in Docker containers for testing.

Any idea(s) on what is going on here? Thanks in advance!

dchao34 commented 1 year ago

Hey!

The output stream that you're trying to read from is actually the analytics results and not a video. To get those out, you can try

# This will print packets from a stream to stdout.
# This will work for *any* stream, independent of the data type.
vaictl -p PROJECT_ID \
         -l LOCATION_ID \
         -c application-cluster-0 \
         --service-endpoint visionai.googleapis.com \
receive streams packets STREAM_ID

c.f. https://cloud.google.com/vision-ai/docs/read-stream#read-outputs

Let me know how that goes.

jeff-wursta commented 1 year ago

The output stream that you're trying to read from is actually the analytics results and not a video.

I'm interpreting this as "the output stream we want to read from should be the analytics results, which we can get from the receive streams packets command", is that correct?

If so, why is the receive streams video-file crashing? And is there a way to ingest the processed video output?

Edit: to answer your question the receive streams packets command does work like a charm, and allows us to get all of the captured metadata for the stream. However, it would be nice to get a visual representation of the output, or at least an easy way to get the individual segments the metadata refers to.

Edit 2: Maybe there isn't a way to get processed video output from the model (e.g., video with bounding boxes), and instead we need to manufacture those overlays manually?

dchao34 commented 1 year ago

I'm interpreting this as "the output stream we want to read from should be the analytics results, which we can get from the receive streams packets command", is that correct?

Yep!

If so, why is the receive streams video-file crashing?

This needs a better error message: what it really should have said is that you were attempting to save a video from a non-video stream, rather than burying that detail in the log printout (you can actually see it in your trace). The error messages can definitely be improved.

And is there a way to ingest the processed video output?

Currently there isn't, because the application doesn't contain a "merger" node that overlays the results. In practice, you can in fact merge/overlay these results on your own, but admittedly, this can be somewhat of a tedious process. We will have a visualization tool published around May/June, which will do some of this for you.
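For anyone who wants to attempt the merge/overlay themselves before the visualization tool ships, the core of it is mapping each annotation's normalized bounding box onto the matching video frame and drawing a rectangle. A minimal sketch of the coordinate mapping follows; the annotation field names here are hypothetical, so check the actual protobuf schema of your model's output packets:

```python
# Sketch: map a normalized bounding box from an analytics annotation
# onto pixel coordinates for a video frame. The keys in `annotation`
# are hypothetical -- inspect your stream's packet schema for the
# real field names.

def to_pixel_box(annotation, frame_width, frame_height):
    """Convert normalized [0, 1] box coords to integer pixel coords."""
    top_left = (
        int(annotation["x_min"] * frame_width),
        int(annotation["y_min"] * frame_height),
    )
    bottom_right = (
        int(annotation["x_max"] * frame_width),
        int(annotation["y_max"] * frame_height),
    )
    return top_left, bottom_right

# With OpenCV installed, each box can then be burned into the frame:
#   import cv2
#   cv2.rectangle(frame, top_left, bottom_right, (0, 255, 0), 2)

if __name__ == "__main__":
    ann = {"x_min": 0.25, "y_min": 0.1, "x_max": 0.75, "y_max": 0.9}
    print(to_pixel_box(ann, 1920, 1080))  # ((480, 108), (1440, 972))
```

The tedious part in practice is frame/annotation time alignment, not the drawing itself.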

What kind of use case are you currently thinking about in your application? Any features that you would like to see that could help with that?

jeff-wursta commented 1 year ago

A use case we are looking into is real-time alerting when certain events/actions within a stream are identified, including the video segment or screenshot (bonus if it included bounding boxes already) of when the action took place. Since Vision AI already has the stream info along with all of the metadata for the stream, it would be really cool if we could enable some "Enable Processed Video" feature that let us pull down processed video segments meeting certain criteria, with any bounding boxes already overlaid.
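For concreteness, the alerting consumer we have in mind would be something like the sketch below, scanning the decoded annotation records from the packets stream and firing a callback on a target label. The record fields are hypothetical, since the real packet schema depends on the deployed model:

```python
# Sketch of a real-time alerting consumer over decoded annotation
# records (e.g., parsed from `vaictl ... receive streams packets`).
# The "detections"/"label"/"confidence" fields are hypothetical.

def watch_for_events(records, target_label, on_alert, min_confidence=0.8):
    """Call on_alert(record) for each record containing target_label."""
    alerted = []
    for record in records:
        for detection in record.get("detections", []):
            if (detection["label"] == target_label
                    and detection["confidence"] >= min_confidence):
                on_alert(record)
                alerted.append(record)
                break  # one alert per record is enough
    return alerted

if __name__ == "__main__":
    sample = [
        {"timestamp": 1.0, "detections": [{"label": "person", "confidence": 0.92}]},
        {"timestamp": 2.0, "detections": [{"label": "car", "confidence": 0.70}]},
    ]
    alerts = watch_for_events(
        sample, "person", lambda r: print("alert at", r["timestamp"]))
    print(len(alerts))  # 1
```

Attaching the corresponding video segment (or an annotated screenshot) to the alert is the part that currently requires the manual overlay work discussed above.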

As a side note, I think the Google documentation could really use some updating as well, since it specifically calls out using the application-output-* stream (a non-video stream) for saving video segments.

Thanks a lot for clarifying though, this was very insightful.

I will let you close out the ticket in case you have any other comments or questions.

dchao34 commented 1 year ago

Thanks so much for the feedback!

Will report back here if there are any new developments on this front.