AlphamaxMedia / netv2-ideas

Ideas for how to use NeTV2

Questions Regarding Possible use for Video Game Overlays – Latency and Access to Input Signal Frame Data #26

Open a4ff7810 opened 4 years ago

a4ff7810 commented 4 years ago

(This was originally meant to go on the NeTV2 subreddit. Unfortunately, all topics seem to have been archived – including the "New Topics For Discussion" thread. So I hope posting it here instead is alright. Sorry for any inconvenience this might cause!)


Hi everyone,

I'm considering getting an NeTV2 to create overlays for video games based on content shown to the player(s). The input signals I'd like to work with are not encrypted.

While working through the idea in my head and creating some proof of concept code, I've come across two main questions I've not been able to find answers to yet. Maybe somebody can help me out :). I apologise if I've overlooked something or been too thick to put things together myself with what I found.

Quick note: I'm not currently a hardware person. I had heard of FPGAs (and find them fascinating), but have never worked with one. On the software side I feel comfortable enough to at least prototype the idea.

With that in mind, here's what I'd really love to know before I try and get my hands on some actual hardware (I live in Germany, so getting an NeTV2 might prove relatively tricky/expensive):

 

1) Does adding the overlay in NeTV Classic Mode introduce any (significant) latency to the output when compared to directly displaying the input signal?

What constitutes "significant" is up for debate of course, but let's just assume that more than one frame of added latency to the output would be undesirable. That'd be roughly 16 ms at 60 fps.

To be clear: I don't mind the overlay lagging behind by considerably more than that! In fact, I am not currently expecting to update the overlay more than once or twice a second in the first place.

What I do care about is the latency of the "combined" output (input signal plus overlay) when compared to "just" displaying the input directly (i.e. without the NeTV2 present in the device chain at all).

Looking at the Crowd Supply page – specifically the diagrams under "More on Classic and Libre Modes" – I'm assuming that the input signal is pretty much just being "passed through" without any additional buffering. The overlay then gets rendered on top essentially "out of step" (which would suit me perfectly fine) using a fairly (computationally) cheap set of operations, which shouldn't add much latency either. I'm hoping this would hold true at least in NeTV Classic Mode and thus won't introduce noticeable input delay.

I hope the question makes sense. I'd be happy to try and clarify if I'm expressing myself and the idea poorly.

 

2) Can input frames be captured/grabbed/accessed from the Raspberry Pi while in NeTV Classic Mode on an unencrypted source?

Alternatively, since I feel like I'm unclear on a couple of things here: Is there something of a Libre/NeTV Classic Mode "hybrid"?

I'm hoping some context/examples will illustrate what I'm trying to get at here:

The overlays I want to add to the games should be based on what's being displayed on screen (i.e. the input signal) at any given time.

Imagine, for example, counting the number of red dots being shown to the player(s) on a game's mini-map (using OpenCV in my case) and displaying the resulting number somewhere else on the screen.

For this, I'll naturally need to access what's being shown to the player by the game. Since the input signal isn't encrypted, Libre Mode would seem like a good fit. However, the input and output buffering shown in the diagrams on the Crowd Supply page (see above) would probably make it less than ideal for situations in which added latency "matters" (see above).

So ideally, I'd love to "just access" the input frames from the Pi. Since I'll be using OpenCV, I'd be ecstatic if there were a handle I could pass to cv2.VideoCapture. But I'd be perfectly fine with other methods as well. Say, for example, frames being dumped to a (RAM) disk, sort of like a rolling buffer/minimal queue or something along those lines.

The "worst case" scenario would see me using an HDMI splitter to duplicate the input signal, hooking up a separate (USB) capture card to the Pi, and grabbing frames with that. The other duplicated stream would then feed into the NeTV2 running in Classic Mode. Obviously that would be a pretty clunky (and more expensive) solution, and I've not yet done much research into its feasibility.

Just to be clear: in this case I'd be happy to do my own (albeit heavily delayed and therefore less than ideal) kinda-sorta-but-not-really "alpha blending", for example by rendering the desired information on top of a shape/outline whose background colour roughly matches whatever it's being layered on top of. It'd be at least marginally better than poorly anti-aliased text, I reckon.

In the section "Libre Mode" under "More on Classic and Libre Modes" (see link above) I found the following: "[Libre Mode] works only with unencrypted video feeds, but has full access to the entire video stream. This lets you arbitrarily manipulate pixels in real time, either onboard or by plugging into a host [...]" (Emphasis and wording in the beginning mine.) This makes me hopeful I might be able to use the Pi to do the processing. Like I said, I don't expect more than a couple frames per second (or even seconds per frame) considering the limited power of the platform, but I would greatly prefer for my solution to be contained in a single plug-and-play/set-and-forget style box :).

Maybe it's become clear(er) why I posed the alternative version of this question at the top of this section as well. In this "hybrid mode" I would want to reap the benefits of adding as little latency to the input signal as possible, while at the same time overlaying information computed from the input. I'd be perfectly fine with buffering the input for processing, of course. Naturally, it'd be awesome if I could somehow make use of actual transparency/alpha blending, e.g. by providing rendered PNG files for the overlay input. However, I'd also happily accept "just rendering on top", especially since that's likely to be quicker than having to calculate the properly blended values.
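For what it's worth, the PNG-style alpha blending described above is cheap to express in NumPy, assuming an 8-bit RGBA overlay and a BGR base frame (the function name is mine, purely illustrative):

```python
import numpy as np

def blend_rgba_over_bgr(base_bgr, overlay_rgba):
    """Alpha-composite an 8-bit RGBA overlay (e.g. a rendered PNG)
    over an 8-bit BGR frame, returning a BGR result."""
    alpha = overlay_rgba[..., 3:4].astype(np.float32) / 255.0
    overlay_bgr = overlay_rgba[..., 2::-1].astype(np.float32)  # RGBA -> BGR
    out = overlay_bgr * alpha + base_bgr.astype(np.float32) * (1.0 - alpha)
    return out.astype(np.uint8)

base = np.full((2, 2, 3), (255, 0, 0), dtype=np.uint8)       # solid blue (BGR)
ovl = np.full((2, 2, 4), (255, 0, 0, 255), dtype=np.uint8)   # opaque red (RGBA)
print(blend_rgba_over_bgr(base, ovl)[0, 0])  # → [  0   0 255]
```

Doing this per-pixel in software is of course exactly the kind of work the FPGA overlay path does for free; the sketch is only relevant for the "compute on the Pi" variants discussed above.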

As with question 1, I hope this one makes sense. And again, I'll gladly elaborate on any point that may have remained unclear.

Sorry for the wall of text! When I set out to write this I didn't plan for it to be this long. I'm hugely excited about the possibilities and just want to perform a sanity check before committing to this potential project monetarily.

I'm greatly looking forward to hearing back from anyone who's got any input on this :)!

bunnie commented 4 years ago

hi! sorry, I didn't realize the topics were archived in Reddit. So it goes, I've opened a new thread.

Answer to 1) latency:

to be clear, there are two paths for video:

1) input -> output
2) overlay -> output

For the input video, there is almost no latency added. iirc the latency from the first pixel into the machine to the first pixel out is measured in dozens of pixel times, definitely less than a single line of video.

For the overlay video, it needs to be synchronized to the input video. Thus, the overlay is delayed by about 1.5 frame times relative to its rendering time.

Typically for a video game you care about the input->output path, so effectively there is no added latency there. Many video gamers have commented that they like this feature.
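To put those numbers in perspective, here is some back-of-the-envelope arithmetic, assuming standard CEA-861 1080p60 timing (2200 x 1125 total pixels at a 148.5 MHz pixel clock) and taking "dozens of pixel times" as roughly 48 pixels:

```python
# Rough latency arithmetic for 1080p60 (CEA-861 timing).
PIXEL_CLOCK_HZ = 148_500_000   # 1080p60 pixel clock
TOTAL_PX_PER_LINE = 2200       # 1920 active + horizontal blanking
TOTAL_LINES_PER_FRAME = 1125   # 1080 active + vertical blanking

pixel_time_ns = 1e9 / PIXEL_CLOCK_HZ                     # ≈ 6.7 ns
line_time_us = TOTAL_PX_PER_LINE * pixel_time_ns / 1e3   # ≈ 14.8 µs
frame_time_ms = (TOTAL_PX_PER_LINE * TOTAL_LINES_PER_FRAME
                 * pixel_time_ns / 1e6)                  # ≈ 16.7 ms

# "Dozens of pixel times" of pass-through delay, vs. one frame:
passthrough_ns = 48 * pixel_time_ns                      # well under 1 µs
print(f"{pixel_time_ns:.1f} ns/pixel, {line_time_us:.1f} µs/line, "
      f"{frame_time_ms:.2f} ms/frame, passthrough ≈ {passthrough_ns:.0f} ns")
```

So the pass-through delay is on the order of hundreds of nanoseconds, roughly five orders of magnitude below the one-frame (~16.7 ms) budget mentioned in the question, while the ~1.5-frame overlay delay works out to about 25 ms.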

2) There is no practical way to access the frames via the Pi currently, as the back-channel from the FPGA to the Pi is just a 115,000 baud serial link (the forward channel is HDMI). However, this is probably changing soon -- I'm seeing reports that another user community, HDMI2USB, is getting close to getting a USB3.0 interface working that is compatible with the NeTV2 hardware. I haven't tried combining this into the existing design, but it's very promising that you could get frame capture/video streaming with this in place. The main caveat is you would need to purchase and install an add-on board that converts the internal PCI-E card edge connector to a USB3.0 socket.
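To see why that serial back-channel rules out frame capture, a quick throughput estimate (assuming the standard 115,200 baud rate, close to the figure quoted above, with 8N1 framing):

```python
# Why a ~115 kbaud serial link can't carry video: rough throughput math.
BAUD = 115_200                # standard rate close to the quoted 115,000
BITS_PER_BYTE_ON_WIRE = 10    # 8N1 framing: 1 start + 8 data + 1 stop bit
bytes_per_sec = BAUD / BITS_PER_BYTE_ON_WIRE   # ≈ 11.5 KB/s

frame_bytes = 1920 * 1080 * 3                  # one raw 24-bit 1080p frame
seconds_per_frame = frame_bytes / bytes_per_sec
print(f"~{seconds_per_frame / 60:.0f} minutes per raw frame")  # → ~9 minutes
```

Even with aggressive compression the link would remain several orders of magnitude too slow, which is why a USB 3.0 path (as in the HDMI2USB work mentioned above) is the realistic route.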

JeremyRand commented 1 year ago

> I'm seeing reports that another user community, HDMI2USB, is getting close to getting a USB3.0 interface working that is compatible with the NeTV2 hardware. I haven't tried combining this into the existing design, but it's very promising that you could get frame capture/video streaming with this in place. The main caveat is you would need to purchase and install an add-on board that converts the internal PCI-E card edge connector to a USB3.0 socket.

@bunnie Any chance you could link to those reports (or whatever newer links might have superseded them)? I'm failing to dig up details on whatever project this was.