ausocean / openfish

OpenFish is an open-source system written in Go for classifying marine species. Tasks involve importing video or image data, classifying and annotating data (both manually and automatically), searching, and more. It is expected that OpenFish will utilize computer vision and machine learning techniques.
https://ausocean.github.io/openfish/

Provide API to get video #114

Open scott97 opened 7 months ago

scott97 commented 7 months ago

Problem

To do any sort of computer vision classification, we will need training data consisting of our annotations and video. Videostreams currently do not let us download video; they only provide a URL to the YouTube video.

Solution

I propose a new API that retrieves a fragment of video associated with that videostream.

GET http://openfish.appspot.com/api/v1/videostreams/1234/media?time=03:11:36-03:11:40

When preparing training data, you could easily fetch the video fragments and annotations in N+1 HTTP requests (N = number of training data pairs). First, fetch the annotations with whatever filter criteria you want; then, for each annotation, fetch the media using its start and end times and videostreamId.

GET http://openfish.appspot.com/api/v1/annotations?observation[common_name]=Giant Cuttlefish
GET http://openfish.appspot.com/api/v1/videostreams/<id>/media?time=<start>-<end>

Implementation

One challenge is that our videos are hosted on YouTube and are not easily downloadable. There is a Go library for downloading videos, https://github.com/kkdai/youtube, but it downloads the whole video, not a fragment. Given that we are dealing with streams many hours long, it is inefficient to download an entire video each time the API receives a request and then discard the majority of it.

One solution would be to cache the video, so that a series of requests for the same videostream at different times would only need to download the video once.
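The caching idea could look roughly like the sketch below: an in-memory map keyed by videostream ID that runs the download at most once per ID, even under concurrent requests. The type and function names are hypothetical, the fetch function stands in for whatever downloader is chosen, and the sketch deliberately omits eviction, which many-hour videos would almost certainly need.

```go
package main

import (
	"fmt"
	"sync"
)

// entry records the outcome of one download. sync.Once guarantees the
// download runs a single time even if many requests arrive together.
type entry struct {
	once sync.Once
	path string
	err  error
}

// videoCache maps a videostream ID to the local path of its downloaded video.
// (Hypothetical type, not part of OpenFish. No eviction is implemented.)
type videoCache struct {
	mu      sync.Mutex
	entries map[string]*entry
	fetch   func(id string) (string, error) // downloads the video, returns its path
}

func newVideoCache(fetch func(id string) (string, error)) *videoCache {
	return &videoCache{entries: make(map[string]*entry), fetch: fetch}
}

// get returns the cached path for id, downloading the video on first use.
// A failed download is cached too; a real implementation may want to retry.
func (c *videoCache) get(id string) (string, error) {
	c.mu.Lock()
	e, ok := c.entries[id]
	if !ok {
		e = &entry{}
		c.entries[id] = e
	}
	c.mu.Unlock()

	e.once.Do(func() { e.path, e.err = c.fetch(id) })
	return e.path, e.err
}

func main() {
	downloads := 0
	cache := newVideoCache(func(id string) (string, error) {
		downloads++ // stand-in for the real, slow download
		return "/tmp/" + id + ".mp4", nil
	})

	// Three fragment requests against the same videostream.
	for i := 0; i < 3; i++ {
		path, err := cache.get("1234")
		fmt.Println(path, err)
	}
	fmt.Println("downloads performed:", downloads)
}
```

The map lock is held only while looking up the entry, so a slow download for one videostream does not block requests for another.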

An alternative, and maybe better, solution would be to use YT-DLP: https://github.com/yt-dlp/yt-dlp. YT-DLP is capable of downloading only small segments of video. For example, this command downloads 4 seconds of video:

yt-dlp --download-sections "*03:11:36-03:11:40" --force-keyframes-at-cuts https://www.youtube.com/watch?v=cnF6UTroZmc

YT-DLP downloads a slightly longer fragment than requested, as the download needs to start at an I-frame/keyframe. --force-keyframes-at-cuts re-encodes the video using ffmpeg so that it begins with a keyframe, chopping the video back down to length.

YT-DLP is written in Python and depends on binaries such as ffmpeg. This means App Engine is not viable, as the solution would no longer be pure Go; however, Google Cloud Run with a Docker image may be an alternative option.

scott97 commented 7 months ago

A proof of concept branch has been created: media-api. The API is noticeably slow because it downloads the whole video fragment, re-encodes it, then sends it to the client.