CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

Suggestion to get thumbnails to web interface #324

Open TheSashmo opened 1 year ago

TheSashmo commented 1 year ago

I am trying to experiment with a headless solution where I take data from the console output of UG and send it to a local web interface, so I can see the logs and remotely control everything. The question came up: "why don't you get the video in the browser, or at least thumbnails?" That seems like a great idea, but other than creating another encode and redirecting it locally to the browser, does anyone have a suggestion for a way, even at 1 FPS or less, to pass the video to a local web server where a user can see it? I already take the text data and make volume bars and some other features, but adding the thumbs would be the best solution, and I don't think it would take much more horsepower to do it.

Update, I know I have asked this before, but I'm just putting it out into the universe again! That old GitHub suggestion was so outdated it couldn't even get running.

Update, to be clear: what I need is to grab thumbnails on the encoder side and the decoder side. I can handle all the web stuff after that.

thoughts/suggestions?

mpiatka commented 1 year ago

So, in other words, you'd like a tool that periodically saves a miniaturized frame to a file somewhere, which then gets read by the webserver. Is that correct?

mpiatka commented 1 year ago

I've added a rough draft of a tool on the thumbnailgen branch in my fork. It uses the same infrastructure as the previews in the GUI and saves the frames as JPEG files. Right now there is a hard-coded limit of 1 frame per second.

To build it you need libjpeg installed.

cd tools
make thumbnailgen

The tool takes 2 arguments - path of the unix socket it should create and the path of the output picture.

./thumbnailgen /tmp/thumbnailgen_sock pic.jpg

UltraGrid can then be started with the preview display or the preview capture filter (to get thumbnails of the captured video that is being sent, use the capture filter; for video received from the network, use the display). For example:

./bin/uv -t testcard -d preview:path=/tmp/thumbnailgen_sock:target_size=256x256

Let me know if this works for your use case so I can polish it a bit more and merge into master.
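
On the web side, a minimal sketch of serving the generated picture to a browser might look like the following (Python; the pic.jpg path matches the example above, while the port and the one-second refresh interval are arbitrary choices, not anything UltraGrid requires):

#!/usr/bin/env python3
# Minimal sketch: serve the thumbnail written by ./thumbnailgen over HTTP.
# Assumes thumbnailgen writes to ./pic.jpg as in the example above.
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

THUMB = Path("pic.jpg")  # output path given to thumbnailgen

PAGE = b"""<html><body>
<img id="thumb" src="/thumb.jpg">
<script>
  // reload the thumbnail roughly once per second
  setInterval(() => {
    document.getElementById("thumb").src = "/thumb.jpg?" + Date.now();
  }, 1000);
</script>
</body></html>"""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/thumb.jpg"):
            data = THUMB.read_bytes() if THUMB.exists() else b""
            ctype = "image/jpeg"
        else:
            data = PAGE
            ctype = "text/html"
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(data)))
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()

Running this next to thumbnailgen and opening http://localhost:8000 should show the thumbnail refreshing roughly once a second.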

TheSashmo commented 1 year ago

This is very close to what I was going to do. Since I use multicast a lot, I was going to output UG to multicast and then use ffmpeg to grab I-frames and make thumbnails. This way seems a lot better. I will try to compile it and put it into my test setup to see how much more CPU is used. I think having an option for the FPS would be worthwhile, considering different setups.

TheSashmo commented 1 year ago

Works great!.... Really great.

So then I will ask another question at the same time.

Right now I have an application that looks at the log output for the audio levels and pushes that via a socket connection to my browser, similar to how this jpg is done. Would something like what you have for this picture via the display be available for audio too? BTW, I also use the log parsing to get the transmission statistics.

mpiatka commented 1 year ago

Glad it works for you. I've cleaned up the code a bit, made the fps limit configurable (it also accepts floating point values, so it's possible to specify 0.1 for 1 frame per 10 seconds) and merged it into master.

As for the audio levels - right now the meter widgets in the Qt GUI get that info via the control port. When "stats on" is written to the control port, it should start sending messages like

stats ARECV volrms0 -18 volpeak0 -14.9897 volrms1 -18 volpeak1 -14.9897

Those are the levels of the received audio. If you also want stats for the audio that is being sent, you need to launch uv with the --audio-filter controlport_stats parameter. These reports are sent on every audio "frame" so you'll probably need to rate limit them a bit.
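
A minimal sketch of a client for these reports might look like this (Python). It assumes the control port is reachable as a TCP socket on localhost:5054 (adjust to however uv is started) and that each report arrives as one line in the format quoted above; the once-per-second rate limiting and the exact line ending sent are just assumptions for illustration:

#!/usr/bin/env python3
# Minimal sketch: read audio level stats from the UltraGrid control port.
# Expects reports shaped like the line quoted above:
#   stats ARECV volrms0 -18 volpeak0 -14.9897 volrms1 -18 volpeak1 -14.9897
import socket
import time

HOST, PORT = "127.0.0.1", 5054  # assumption: control port on localhost:5054
RATE_LIMIT = 1.0                # forward at most one report per second

def parse_stats(line):
    # "stats ARECV volrms0 -18 volpeak0 -14.99 ..." -> {"volrms0": -18.0, ...}
    parts = line.split()
    if len(parts) < 4 or parts[0] != "stats":
        return None
    keys, values = parts[2::2], parts[3::2]
    try:
        return dict(zip(keys, map(float, values)))
    except ValueError:
        return None

def main():
    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(b"stats on\r\n")  # exact line ending is an assumption
        buf = b""
        last_emit = 0.0
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                levels = parse_stats(line.decode(errors="replace").strip())
                now = time.monotonic()
                if levels and now - last_emit >= RATE_LIMIT:
                    last_emit = now
                    print(levels)  # push to the browser instead of printing

if __name__ == "__main__":
    main()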

TheSashmo commented 1 year ago

I was testing this locally on one box, encoding with the thumbnails, but the moment I add the -d and make a connection with the decoder on the same box, the thumbs no longer update. I tried moving the position of -d in the command line, but that didn't help. I am sure I am doing something wrong here...

Encoder: ./UltraGrid-continuous-x86_64.AppImage -t decklink:0 -d preview:path=/tmp/thumbnailgen_sock:target_size=512x512 -s embedded -c libavcodec:encoder=libx264:bitrate=20000k:preset=ultrafast --audio-codec=MP3:sample_rate=48000:bitrate=256k --audio-capture-format channels=16 -m 1316 -P 10000 192.168.99.199

Decoder: ./UltraGrid-continuous-x86_64.AppImage -d decklink:device=0:drift_fix -r embedded 192.168.99.199 -P 10000

mpiatka commented 1 year ago

I was planning to mention this in more detail, but it looks like I forgot, sorry. To generate thumbnails on the encoder side it is more appropriate to use the preview capture filter (which needs to come before the -t parameter):

./UltraGrid-continuous-x86_64.AppImage --capture-filter preview:path=/tmp/thumbnailgen_sock:target_size=512x512 -t decklink:0 -s embedded -c libavcodec:encoder=libx264:bitrate=20000k:preset=ultrafast --audio-codec=MP3:sample_rate=48000:bitrate=256k --audio-capture-format channels=16 -m 1316 -P 10000 192.168.99.199

The difference is that the capture filter runs in the sending part of UltraGrid, before the video ever touches the network, while -d preview runs in the receiving part. If you specify the destination address of the decoding machine on the encoder, the stream is sent there instead, and therefore the -d preview doesn't receive anything.

TheSashmo commented 1 year ago

Using release from yesterday....

[capture filter preview] Unable to convert

mpiatka commented 1 year ago

Looks like the pixel conversion from the captured format into RGB is missing. What format are you capturing in? R10k or v210?

TheSashmo commented 1 year ago

For this example it is v210, but it could be any format. It's unknown at the moment; I normally always use auto mode.

mpiatka commented 1 year ago

I've just added conversions for both v210 and R10k. All formats that can come out of decklink should be covered now.

TheSashmo commented 1 year ago

Thanks. I see it working now with the newly added FPS option, but the resulting thumbnails are not even legible compared to before. 256 and 512 examples below.

./UltraGrid-continuous-x86_64.AppImage --capture-filter preview:path=/tmp/thumbnailgen_sock:target_size=512x512 -t decklink:0 -s embedded -c libavcodec:encoder=libx264:bitrate=20000k:preset=ultrafast --audio-codec=MP3:sample_rate=48000:bitrate=256k --audio-capture-format channels=16 -m 1316 -P 10000 192.168.99.199

(attached: 256 and 512 thumbnail examples)

TheSashmo commented 1 year ago

Could this be how interlaced and progressive are being handled?

mpiatka commented 1 year ago

Actually, this is because of how the images are scaled. The GUI preview that this code is based on was designed to have the smallest possible performance impact rather than good quality. As such it's implemented so that:

  1. The frame is scaled in the source pixel format before the conversion to RGB is made.
  2. The scaling algorithm is really primitive. It doesn't even interpret the pixels in any way; it just picks some evenly spaced pixels from the source frame and copies them to the scaled frame, skipping the in-between pixels.

This usually gives good enough results; however, the problem here is that in the v210 pixel format the pixels are organized into 6-pixel-wide "blocks" that cannot be divided. This means the scaling algorithm always has to keep six consecutive pixels together and can only skip over multiples of six pixels, which causes the artifacts seen in your example. A rough sketch of this index picking is shown below.
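
Here is a rough illustration of the effect (Python; not the actual UltraGrid code, and the widths are made up for the example):

# Rough illustration (not the actual UltraGrid code) of why the v210 block
# layout degrades the primitive "pick every n-th pixel" scaling.

def picked_columns(src_width, dst_width, block=1):
    # The primitive scaler works in whole `block`-pixel groups: it picks some
    # evenly spaced source blocks and copies each one verbatim, so consecutive
    # output pixels end up mapping to consecutive source pixels.
    src_blocks = src_width // block
    dst_blocks = dst_width // block
    cols = []
    for b in range(dst_blocks):
        start = (b * src_blocks // dst_blocks) * block
        cols.extend(range(start, start + block))
    return cols

# Shrinking a 48-pixel-wide row down to 12 columns:
print(picked_columns(48, 12, block=1))
# -> [0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44]  evenly spaced samples
print(picked_columns(48, 12, block=6))
# -> [0, 1, 2, 3, 4, 5, 24, 25, 26, 27, 28, 29]     two clumps of 6 pixels

With the 6-pixel block constraint the output is built from a few clumps of adjacent source pixels instead of evenly spaced samples, which is why fine detail like text becomes illegible.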

TheSashmo commented 1 year ago

I understand what you are saying, but I doubt I can agree that it's good enough. It's not even legible. I'd sooner chase I-frames in the compressed video than have that.

mpiatka commented 1 year ago

Alright, I've just added a mode (enabled using the :hq option on the capture filter) that first converts the frame's pixel format and does the scaling afterwards. This should make the text readable. Since this is slower, I've also added a :rate_limit=<fps> option to limit the frame rate on the capture filter side as well.
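
For illustration, assuming the new options chain onto the capture filter with colons like the existing ones (the rate_limit value of 2 fps here is just an example), the earlier encoder command would become something like:

./UltraGrid-continuous-x86_64.AppImage --capture-filter preview:path=/tmp/thumbnailgen_sock:target_size=512x512:hq:rate_limit=2 -t decklink:0 -s embedded -c libavcodec:encoder=libx264:bitrate=20000k:preset=ultrafast --audio-codec=MP3:sample_rate=48000:bitrate=256k --audio-capture-format channels=16 -m 1316 -P 10000 192.168.99.199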

TheSashmo commented 1 year ago

Thanks, I will try that out and report back.

sogorman commented 9 months ago

Does anyone happen to have a macOS binary for thumbnailgen? I can't get it to compile on macOS.

Not complaining about the Linux build, but we do all of our encoding on macOS.

TheSashmo commented 7 months ago

Update on this.... in 1.8.7, trying to do thumbnails. If I force it to 8-bit 4:2:0, the option works fine.

If I leave it on auto, I get the error

[capture filter preview] Unable to convert

If I try with continuous, it works in auto, but the quality of the jpg is complete garbage.

mpiatka commented 7 months ago

"the quality of the jpg is complete garbage"

In the same way as the ones you posted before? Is that with or without the hq option on --capture-filter preview (only available in the continuous build)?

TheSashmo commented 7 months ago

It works on the decoder side but doesn't work on the encoder side, so it's one or the other; I'm not able to use the hq option on both.

mpiatka commented 7 months ago

I'm not entirely sure what the problem being reported is, so to make sure we are on the same page:

On the decoder side, -d preview needs to be used; it does not require nor support the :hq parameter (the quality should be acceptable, since the display gets frames already converted to RGB or UYVY).

On the encoder side, --capture-filter preview is used. In the stable version not all pixel formats are supported yet, which I think is the source of the [capture filter preview] Unable to convert error you reported earlier; you need to use the continuous version for the encoder side. Additionally, some pixel formats need to be converted to RGB before scaling to get a nice result, and that is exactly what the :hq param does (also available only in the continuous version).

Also, when running the preview on both the encoder and the decoder side, the sockets you specify need to be different for the two sides.
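
For illustration, that could look something like this (the socket paths and output file names here are just examples), with one thumbnailgen instance per socket:

./thumbnailgen /tmp/thumb_send_sock send.jpg
./thumbnailgen /tmp/thumb_recv_sock recv.jpg

and then --capture-filter preview:path=/tmp/thumb_send_sock:target_size=512x512 on the sending side and -d preview:path=/tmp/thumb_recv_sock:target_size=512x512 on the receiving side.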