Provide ground-truth segmented camera views

tsbertalan commented 3 years ago

Before it was killed by a short-sighted Take-Two Interactive, the DeepDrive client for GTAV had a beautiful feature: it provided ground-truth segmentation of video data into various classes (vehicle, pedestrian, plant, road, curb, sign/lamp, etc.). Would something like this be possible with FS19? This would enable easy training of drivable-space segmentation networks, and make this a very popular project.

gavanderhoorn commented 3 years ago

There are mainly two sides to this:

1. does such data exist in FS19 (either in the rendering pipeline or somewhere else)?
2. would there be a way to get it out of FS19?

re 1) we don't know right now. FS19 does have a couple of developer-assisting debug visualisations, which may include something like what you're describing.

However, re 2) we'd be limited in any case in what we can export from FS19. See #22 for our discussion about hooking the relevant parts of FS19 to get video data out. Right now we use a simple screenshot approach, with the screenshots published as ROS sensor_msgs/Image messages. That's far from ideal, but the Lua scripting engine in FS doesn't have any concept of video data, so we're making do. #22 would improve on that, but we'd still only have a single camera view (i.e. that of the player).

gavanderhoorn commented 3 years ago

To clarify: the biggest limitation we have is that FS19 doesn't support any native plugins. All mods are Lua-based.

On the one hand that's great, as it's a low overhead way to get started, and mods don't require setting up a development environment with an IDE and a native SDK.

On the other hand it's also limiting, especially for mods like modROS: the Lua engine in FS19 is customised (it's not a standard LuaJIT) and rather sandboxed. As far as we know there is no access to any graphics data (other than for rendering UI overlays, such as menus and the HUD). IO in general is limited, which necessitated quite a few tricks to even be able to publish what we publish now.

We've tried reaching out to GIANTS Software a few times via various channels to discuss this. From the popularity of this (admittedly) limited integration and the potential usability of a simulation environment for agricultural scenarios in general, it would seem there is definitely potential here. But so far GIANTS doesn't seem interested, or at least they haven't responded to any of our enquiries.

tsbertalan commented 3 years ago

Yeah, I know this is a harder thing to do, for sure. I saw you were looking for people with knowledge of OGL rendering pipelines (not me!), and thought that, if you got it, this would be a natural sort of enhancement. Fingers crossed.

gavanderhoorn commented 3 years ago

> Fingers crossed.

If we make any progress with #22, what you describe here may become easier.

But it would probably already be "too late", as in: the rendering pipeline likely doesn't contain that information at that point, and reverse engineering it from a depth / stencil buffer may not actually be efficient, or even possible.

tsbertalan commented 3 years ago

If you can identify which texture is applied at each pixel, then you have only a mapping from a few hundred textures to a handful of classes left to learn.
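
As a rough illustration of that idea, here is a minimal sketch of the lookup step. It assumes a per-pixel texture identifier could somehow be obtained from FS19, which nothing in this thread confirms; the texture names, class labels and the `segment` helper are all made up for the example.

```python
# Hypothetical texture names and class labels, for illustration only.
# None of these come from FS19 or modROS.
TEXTURE_TO_CLASS = {
    "asphalt_01": "road",
    "gravel_02": "road",
    "wheat_diffuse": "plant",
    "tractor_body": "vehicle",
}
UNKNOWN = "unknown"


def segment(texture_ids, lookup=TEXTURE_TO_CLASS, default=UNKNOWN):
    """Map a 2D grid of per-pixel texture names to class labels.

    Textures missing from the lookup table fall back to `default`.
    """
    return [[lookup.get(tex, default) for tex in row] for row in texture_ids]


# A tiny 2x3 "frame" of per-pixel texture names (assumed input format).
frame = [
    ["wheat_diffuse", "wheat_diffuse", "tractor_body"],
    ["asphalt_01", "gravel_02", "sky_dome"],
]
labels = segment(frame)
```

With a table like this, the remaining work is filling in the few hundred texture-to-class entries, by hand or by learning them.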

But I’m still just talking in generalities, since I’ve never worked with rendering pipelines, or even game engines.

tud-cor / FS19_modROS

Provide ground-truth segmented camera views #42