dobkeratops opened 6 years ago
More stills for which a 'road' label would be nice (these also remind me to suggest a 'shadow' label, in case a detector starts finding people/bicycles etc. from silhouettes).
I wonder if there would be any value (re: self-driving car experiments) in road layout markup, i.e. the seemingly 'boring' empty road images could actually serve a purpose.
Definitely! I am pretty interested in the self-driving car experiments that big companies are running at the moment. A thing that bothers me, however (besides the fact that some companies are rushing to get self-driving cars out there as soon as possible, regardless of whether the technology is actually performing well yet), is that the training data is closed source. So there is no way to verify how they trained their systems, and an open source attempt would definitely be something desirable.
But I think it might be worth thinking a bit more about possible annotation methods and the storage of the data. I guess that the "self-driving car" use case might require us to come up with different annotation methods (you already mentioned a video-based annotation tool), as well as a different way of representing the data (in the database). E.g. when annotating objects we usually just mark some pixels in the image, but in order to create meaningful data for a self-driving dataset we probably need a way to add semantic information as well (e.g. "that bent line represents the path of the road and is not meant to be interpreted as the road itself").
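Just to make that concrete, here's a rough sketch of what an annotation record with an explicit semantic role could look like (TypeScript; all field names are made up for illustration and don't correspond to the current database schema):

```typescript
// Hypothetical sketch: the raw geometry is kept separate from the semantic
// role it is meant to play. Names are invented for illustration only.

type SemanticRole =
  | "road_surface"     // the polygon IS the drivable area
  | "road_centerline"  // the polyline represents the path of the road, not the road itself
  | "lane_boundary"
  | "give_way_line";

interface Point2D {
  x: number; // pixel coordinates in the annotated frame
  y: number;
}

interface RoadAnnotation {
  id: string;
  imageId: string;     // which still / video frame this belongs to
  geometry: Point2D[]; // the clicked vertices (polygon or polyline)
  closed: boolean;     // true for polygons, false for open polylines
  role: SemanticRole;  // how the geometry should be interpreted
  notes?: string;      // free-form annotator comment
}

// Example: a bent polyline marking the path of the road, not its extent.
const example: RoadAnnotation = {
  id: "ann-001",
  imageId: "frame-0042",
  geometry: [{ x: 320, y: 480 }, { x: 335, y: 360 }, { x: 390, y: 250 }],
  closed: false,
  role: "road_centerline",
};
```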
Some more (random) thoughts:
I guess we could probably also do that with the current capabilities, but I think it might be worth investing some time in order to get the full potential out of it.
Perhaps I should resurrect my javascript tool to do that (it had feature-crept into a mesh editor rather than a simple boundary markup tool... perhaps this would be a use for it; similarly, if it were extended to handle video, i.e. vertices placed in time, maybe you could do some interesting things with interpolation...).
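To illustrate the "vertices placed in time" idea: if an annotator only places a vertex at a few keyframes, its position in between could be linearly interpolated. A minimal sketch (TypeScript; the types are hypothetical and not part of the existing tool):

```typescript
// Minimal sketch of interpolating an annotated vertex between keyframes.
// Types and names are hypothetical, not taken from the existing tool.

interface Keyframe {
  frame: number; // frame index in the video
  x: number;
  y: number;
}

// Linearly interpolate a vertex position at `frame`, given keyframes sorted
// by frame index. Clamps to the first/last keyframe outside the range.
function interpolateVertex(keyframes: Keyframe[], frame: number): { x: number; y: number } {
  if (keyframes.length === 0) throw new Error("no keyframes");
  if (frame <= keyframes[0].frame) return { x: keyframes[0].x, y: keyframes[0].y };
  const last = keyframes[keyframes.length - 1];
  if (frame >= last.frame) return { x: last.x, y: last.y };

  for (let i = 0; i < keyframes.length - 1; i++) {
    const a = keyframes[i];
    const b = keyframes[i + 1];
    if (frame >= a.frame && frame <= b.frame) {
      const t = (frame - a.frame) / (b.frame - a.frame);
      return { x: a.x + t * (b.x - a.x), y: a.y + t * (b.y - a.y) };
    }
  }
  return { x: last.x, y: last.y }; // unreachable; keeps the compiler happy
}

// Usage: annotate frames 0 and 30 by hand, get frame 15 for free.
const track: Keyframe[] = [
  { frame: 0, x: 100, y: 400 },
  { frame: 30, x: 180, y: 320 },
];
console.log(interpolateVertex(track, 15)); // { x: 140, y: 360 }
```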
That would be great! As you already mentioned, a browser-based solution would in any case be superior to a desktop app.
btw: I haven't tried it yet, but this also sounds interesting: http://carlvondrick.com/vatic/ However, it looks like it's not maintained anymore.
...but in order to create meaningful data for a self-driving dataset we probably need a way to add semantic information as well
Yes, I'm beginning to think that too. What I started out trying to do in that tool was labels on points, edges, or polygons; it was kind of confused between 'objects' and 'components' in the UI & data structure. Figuring out what to store, and how to present it, will take some experimentation.
(e.g. a labelled point is just an object with one vertex... OK, so the label is still 'an object'.)
I suppose placing points in the centre of the road would begin to allow orienting the cameras between frames (i.e. before you've meticulously clicked out the whole road boundary). Points which connect to multiple lines would be de-facto junctions, I guess, but you would still want to mark more, e.g. 'where to give way' etc.
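A sketch of what "a labelled point is just an object with one vertex" and "points connecting multiple lines are de-facto junctions" might look like in one data structure (TypeScript; all names are hypothetical, nothing here is from the existing tool):

```typescript
// Hypothetical sketch: every label is an "object" with some vertices, and
// junctions fall out of the connectivity rather than being a separate primitive.

interface Vertex { id: string; x: number; y: number; }

interface LabelledObject {
  label: string;       // e.g. "road_centre", "give_way"
  vertexIds: string[]; // 1 vertex = point, 2+ = polyline; closed polygons could add a flag
}

// Count how many centreline segments touch each vertex; a vertex shared by
// more than two segments is a de-facto junction.
function findJunctions(vertices: Vertex[], objects: LabelledObject[]): Vertex[] {
  const degree = new Map<string, number>();
  for (const obj of objects) {
    if (obj.label !== "road_centre") continue;
    for (let i = 0; i < obj.vertexIds.length - 1; i++) {
      for (const vid of [obj.vertexIds[i], obj.vertexIds[i + 1]]) {
        degree.set(vid, (degree.get(vid) ?? 0) + 1);
      }
    }
  }
  return vertices.filter(v => (degree.get(v.id) ?? 0) > 2);
}
```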
a browser-based solution would in any case be superior to a desktop app.
Right... and whilst we might not be able to scroll through video, there's still WebGL. It would be nice if we could make something that gradually filled out a 3D map that you could fly around; that would make the activity spatially engaging.
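A rough sketch of the WebGL side, using three.js purely as an example: annotated road centrelines become 3D polylines you can orbit around. This assumes the annotations have already been lifted into some world-space coordinates, which is the hard part:

```typescript
// Sketch only: render annotated road centrelines as 3D polylines with three.js
// and let the user fly/orbit around them.
import * as THREE from "three";
import { OrbitControls } from "three/examples/jsm/controls/OrbitControls";

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, window.innerWidth / window.innerHeight, 0.1, 1000);
camera.position.set(0, 20, 30);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

const controls = new OrbitControls(camera, renderer.domElement);

// One annotated centreline, already in world coordinates (hypothetical data).
const centreline = [
  new THREE.Vector3(0, 0, 0),
  new THREE.Vector3(2, 0, -10),
  new THREE.Vector3(8, 0, -20),
];
const geometry = new THREE.BufferGeometry().setFromPoints(centreline);
const material = new THREE.LineBasicMaterial({ color: 0x00ff00 });
scene.add(new THREE.Line(geometry, material));

function animate() {
  requestAnimationFrame(animate);
  controls.update();
  renderer.render(scene, camera);
}
animate();
```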
I wonder if there would be any value (re: self-driving car experiments) in road layout markup, i.e. the seemingly 'boring' empty road images could actually serve a purpose. I'm sure there are plenty of clever ways of figuring this out automatically (e.g. companies are gathering video to do pointcloud mapping, some are using pricey LIDAR, and of course making use of GPS... correlating with optical flow), but would enough simple labelled data let you get from an image to a judgement on a potential path, or at least give something for testing other systems? I wonder if it would be feasible to categorise layouts with a finite set of options, or would all the permutations of more complex junction types just get out of hand?
(I think, as judgements of the whole image, these could be done at quite a low resolution? ...and whilst a completely general net might be difficult, could you in principle train nets that work in local regions, with a system just downloading the ones appropriate to where it is from some big database?)
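The "download a net appropriate to where you are" idea would need some way of keying models by region. One simple scheme (a sketch; the URL pattern is entirely made up) is the standard slippy-map tiling used by web maps:

```typescript
// Sketch of keying region-specific models by map tile. The tile maths is the
// standard web-mercator "slippy map" formula; the URL pattern is invented.

function lonLatToTile(lon: number, lat: number, zoom: number): { x: number; y: number } {
  const n = Math.pow(2, zoom);
  const x = Math.floor(((lon + 180) / 360) * n);
  const latRad = (lat * Math.PI) / 180;
  const y = Math.floor(((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n);
  return { x, y };
}

// Hypothetical: ask some model server for the net trained on this tile.
function modelUrlFor(lon: number, lat: number, zoom = 12): string {
  const { x, y } = lonLatToTile(lon, lat, zoom);
  return `https://example.org/road-nets/${zoom}/${x}/${y}.onnx`;
}

console.log(modelUrlFor(-0.1276, 51.5072)); // somewhere over London
```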
(Perhaps these would suit diagrammatic/iconic labels; it would be easier to figure out the difference between what's meant by 'bend'/'turn' etc.)
t-junction
right turn
road bends to right+left_turn
straight ahead
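If the finite-set-of-options idea holds, the categories could be something as simple as a closed vocabulary plus composites. A sketch, with labels invented to roughly match the examples above (the real set would need experimentation to see whether complex junctions blow it up):

```typescript
// Sketch of a small closed vocabulary for whole-image road-layout labels.
// Purely illustrative; not an existing label set.

type RoadLayout =
  | "straight_ahead"
  | "bend_left"
  | "bend_right"
  | "t_junction"
  | "crossroads"
  | "left_turn"
  | "right_turn";

interface LayoutLabel {
  imageId: string;
  layouts: RoadLayout[]; // composites like "road bends to right + left turn" become two entries
}

const example: LayoutLabel = {
  imageId: "frame-0099",
  layouts: ["bend_right", "left_turn"],
};
```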