ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

discussion - vision net use cases, what kind of labelling tools suit #45

Open dobkeratops opened 7 years ago

dobkeratops commented 7 years ago

Tangent from the other thread:-

Would it be worth identifying a few use cases of vision nets and thinking about what kind of images, labels, organization, and tools would be useful for them? Are there any other use cases of vision nets worth thinking about, beyond the examples given here? How many of these use cases would potentially be within the scope of this tool?

The big examples that spring to mind for me are:-

self-driving cars, delivery robots

Car types - recognising anything assists with a sense of scale, hence estimating distances; a human can drive without LIDAR, even without stereo vision. Certain traffic rules relate to buses. Pedestrians - and intent: are they heading toward crossings? Road surface vs pavement vs grass verge - where can you actually drive; also road conditions (drive carefully on ice), and switching on the windscreen wipers when it's raining. Road markings - how to label those? Where to stop, centrelines. Road signs - differences in signs between regions (USA vs Europe etc.).

agricultural robots

e.g. detecting weeds for destruction - must be able to distinguish a 'weed' from a 'crop'; harvesting - 'is this ripe?'. This could cover both farm robots and smaller-scale assists for permaculture, i.e. production involving mixtures of plant types (where planting/harvesting is currently more labour intensive).

calorie estimator app

There's an existing app that can guess the calorie content of a plate of food; obviously this needs to recognise types of foodstuffs, but also get a sense of scale and quantities. Could we get enough data to make a crowdsourced workalike?

AI guide dog

An assist for blind/partially sighted people: describes verbally what is around a person, producing an audio stream. Needs very broad labels for urban & domestic environments.. and a lot of contextual detail?

domestic cleaning robot

OK, some of this is still slightly sci-fi regarding joints etc, but imagine the task of cleaning - identifying dirty surfaces, distinguishing trash from items to keep.

And much more down to earth:-

assist for digital artists, CGI

This is what interests me personally the most. There are great tools for raw 'photogrammetry' scanning, but scenes for games aren't built this way. Where a vision net could help: recognising textures/surface materials and 'instanceable objects', e.g. taking a scan of a region and replacing objects with roughly correct repeated units - trees with trunks and foliage in roughly the right place instead of an accurate scan of each tree; similarly bins, street lamps, benches etc., instead of having each one be a unique scan. Repeating textures: e.g. buildings with windows - repeat one window instead of each window being uniquely scanned.

So, this would require the basic labels 'bin', 'street lamp' etc., but with refined sub-types, e.g. different regions have very different styles.

Might it be too difficult to actually name the sub-types? Would we want a visual label, e.g. "here's one example" (although in practice 3D artists do tend to try and think of names for the variations)?
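
To make that concrete, here's a minimal sketch (in Python, purely illustrative - none of these type or field names exist in ImageMonkey today) of how a basic label with named sub-types and 'visual', exemplar-only sub-types could be represented:

```python
# Purely illustrative sketch - none of these names correspond to the
# existing ImageMonkey schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SubType:
    name: Optional[str] = None                   # e.g. "victorian cast-iron", if we can name it
    exemplar_image_ids: List[str] = field(default_factory=list)  # "here's one example"
    region: Optional[str] = None                 # styles differ by region, e.g. "UK" vs "USA"

@dataclass
class Label:
    name: str                                    # basic label, e.g. "street lamp"
    sub_types: List[SubType] = field(default_factory=list)

street_lamp = Label(
    name="street lamp",
    sub_types=[
        SubType(name="modern LED column", region="EU"),
        # a variant nobody has named yet - defined purely by example images
        SubType(exemplar_image_ids=["img_123", "img_456"], region="UK"),
    ],
)
```

The point is just that a sub-type wouldn't need a name to be useful - a handful of example images plus a region tag might be enough for an artist looking for "that kind of street lamp".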

I think the idea of a texture or material label would be useful for this (not an object, but a surface property applied to certain pixels). Would it have uses elsewhere? E.g. identifying 'road texture' as icy vs dry would help a self-driving car. Other label kinds: poses (sitting, walking, running) vs object types (man, woman, child; types of clothing); orientation; lighting conditions and other scene properties - shadows, 'where is the sun', day vs night; weather labels (overcast, sunshine, fog, snow); season (autumn vs spring/summer vegetation).
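
As a rough sketch of how those different kinds of labels could sit side by side (again hypothetical Python, not the existing schema): object annotations with a pose, material/texture annotations attached to a pixel region, and scene-wide attributes like weather and season:

```python
# Rough sketch only - illustrating object labels, material/texture labels
# (a surface property attached to a pixel region) and image-wide scene
# attributes. All type and field names here are hypothetical.
from dataclasses import dataclass, field
from typing import List, Tuple

Polygon = List[Tuple[int, int]]                  # pixel coordinates outlining a region

@dataclass
class ObjectAnnotation:
    label: str                                   # e.g. "man", "woman", "child", "car"
    polygon: Polygon
    pose: str = ""                               # e.g. "sitting", "walking", "running"

@dataclass
class MaterialAnnotation:
    material: str                                # e.g. "asphalt", "grass", "brick"
    condition: str = ""                          # e.g. "dry", "wet", "icy"
    polygon: Polygon = field(default_factory=list)

@dataclass
class SceneAttributes:                           # apply to the whole image, not a region
    lighting: str = ""                           # e.g. "day", "night", "shadows", "sun position"
    weather: str = ""                            # e.g. "overcast", "sunshine", "fog", "snow"
    season: str = ""                             # e.g. "autumn" vs "spring/summer" vegetation

@dataclass
class ImageAnnotation:
    image_id: str
    objects: List[ObjectAnnotation] = field(default_factory=list)
    materials: List[MaterialAnnotation] = field(default_factory=list)
    scene: SceneAttributes = field(default_factory=SceneAttributes)
```

A self-driving-car consumer would mostly read the materials and scene parts, while the CGI use case above cares more about objects and their sub-types - which might be one argument for keeping these label kinds separate in the schema.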

bbernhard commented 7 years ago

Awesome idea!

Focusing on one or more use cases could also help us to make ImageMonkey better known in the ML community.

I am not sure whether it's worth the effort, but it could make sense to create different tooling/UIs for different use cases. If I remember correctly, you once mentioned 3D artists' software where the workspace changes depending on the task you are doing. I could imagine implementing something similar for our use cases.

But even if we get all the tools in place and the database prepared for those use cases, I think the biggest problem is still: where do we get those images? The more specific the use case is, the harder it will be to find publicly available images. Looking at the database statistics, I see quite a few annotations/validations but very few image donations (most of the donations are from a friend of mine who bulk-imported a lot of CC0 licensed images).

So if we want to focus on some specific use cases, we probably need to take care of the images ourselves. One of the next big topics on my list is to get our dataset to 10k images (I'll open a new ticket to brainstorm some ideas). If we can find a use case where a lot of publicly available images exist, we could kill two birds with one stone :)

dobkeratops commented 7 years ago

Re: plants ('permaculture AI') - is there anything to build on? http://www.visualplants.de/Links.html .. http://www.pfaf.org/user/plantsearch.aspx