brainstorming/experiments , offline painted channels

dobkeratops commented 6 years ago

Just been experimenting with GIMP (its interface can be off-putting, but for something free it's pretty amazing), seeing what it's like marking the images up using its named channels feature (again, I don't know the package inside out; it's also got various selection tools I haven't fully explored yet). I haven't looked into its file format, but I'm pretty sure one can script it to extract them somehow.

Marking them up in a paint program (with a pressure sensitive tablet) is pretty nice (because it's an incremental workflow). Still takes longer than straightforward crops of course.

I bet we could make a script to blur the license plates on export too :) (if we marked a 'license plate' channel)
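A minimal sketch of the export-time blur idea, assuming the image has already been extracted as rows of grayscale values and the 'license plate' channel as a same-sized mask. The name `blur_masked`, the list-of-rows layout and the choice of a simple box blur are assumptions for illustration; reading the channels out of GIMP's file format is not covered here.

```python
def blur_masked(image, mask, radius=2):
    """Return a copy of `image` with a box blur applied only where `mask` is set.

    `image` is a list of rows of grayscale values (0-255). `mask` has the same
    shape, with a truthy value wherever the 'license plate' channel is painted.
    Both layouts are assumptions made for this sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]  # unmasked pixels are copied unchanged
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            total = 0
            count = 0
            # Average the neighbourhood, clipped at the image borders.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            out[y][x] = total // count
    return out
```

For a colour image the same function could be run once per colour plane. A larger `radius` would probably be needed before a plate is actually unreadable.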

I'm fairly motivated to do this because it's an easy path to getting alpha-channel sprites etc for other use; perhaps there's a way to get some synergy between such offline work and the superior collaborative potential of your site.. even if it is just dicing these up offline, and submitting crops with the channels converted into outlines. Or maybe the site could assist with managing naming conventions in channels (telling a user if they did it right, etc). The channels could be converted into a set of associated colour-coded images (if the site prefers to take compressed JPGs for bandwidth reasons..).
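As a rough illustration of "channels converted into outlines", the sketch below finds the boundary pixels of one painted channel. It assumes the channel is available as a binary mask (a list of rows); `mask_outline` is a hypothetical helper, and it returns an unordered set of pixels rather than an ordered polygon, so a tracing step would still be needed before submitting polygon annotations.

```python
def mask_outline(mask):
    """Return the boundary pixels of a binary mask as a set of (x, y) tuples.

    A pixel is on the outline if it is set and at least one of its four
    direct neighbours is unset or lies outside the image.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    outline = set()
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                outside = not (0 <= nx < width and 0 <= ny < height)
                if outside or not mask[ny][nx]:
                    outline.add((x, y))
                    break
    return outline
```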

I could make the files available if you want to take a look at what's going through my mind re "how to mark up city scenes" (I know there are also various marked-up sources available in the machine learning community; I need to look at what they do already) and make any suggestions .. I haven't settled on one scheme yet but have various ideas. The minimum is separating pavement/road/cars/people. (You could just make separate channels for each unoccluded object; I also had an idea to mark the boundaries of occlusion whilst just keeping a channel per class. The main thing for these images is to separate road & pavement, but it should be possible to get a load of 'sprites/textures' from the surroundings as well. I was also curious to try channels for 'side/front/back' etc to inform the shape.. that sort of thing is probably more natural but I'm interested to see how much could be done in one place.)

dobkeratops commented 6 years ago

With pixel-level editing and appropriate control when zoomed, note that the 'head/person' task is actually doable in some scenes with distant people, although you might not get them all in cases where the number is much higher.

bbernhard commented 6 years ago

Marking them up in a paint program (with a pressure sensitive tablet) is pretty nice

Cool, that definitely sounds interesting! Do you use GIMP for that? (wondering if GIMP works on a tablet)

I'm fairly motivated to do this because it's an easy path to getting alpha-channel sprites etc for other use; perhaps there's a way to get some synergy between such offline work and the superior collaborative potential of your site

An offline annotation tool is something that's still on my todo list.. I am still hoping to find a mature annotation tool that can be easily extended in a way so that one can push the annotation to the server with a single click (maybe on save?). Creating an annotation tool completely from scratch is probably pretty time intensive... and there will always be the cost of maintaining the software.

GIMP definitely sounds interesting, although I am not sure if it's the best thing for the job (you already mentioned that its UI is pretty off-putting). I am wondering what software other companies (Tesla, Uber) are using... I guess they also need a way to (manually?) label and annotate their data?

I could make the files available if you want to take a look at what's going through my mind re "how to mark up city scenes"

Yeah, I would definitely be interested in that! :)

At the moment I am pretty busy with other features (the skipping and blacklisting of annotations - I'll later show some PoC pictures), but I always like to brainstorm and discuss things, as it's a great way to come up with concrete features we could eventually implement.

dobkeratops commented 6 years ago

Yes, this was all GIMP; I just use a Wacom tablet rather than a convertible laptop. As for GIMP on an Android or iPad style device, I imagine there are other drawing programs, and that is of course yet another possibility for a focussed labelling app .. GIMP is a complex general purpose package

GIMP definitely sounds interesting, although I am not sure if it's the best thing for the job (you already mentioned that its UI is pretty off-putting)

right; initially it looks like hell but once you dig through it, it's got everything needed .. and I'm absolutely certain you could streamline it with scripting. It's not ideal, but it's absolutely capable..

I am wondering what software other companies (Tesla, Uber) are using...I guess they also need a way to (manually?) label and annotate their data?

.. and yes I think they have true dedicated labelling tools, including for video. But it's quite possible they just give people Photoshop or something and give some script or plugin (or just plain instructions) to check the naming conventions. One weird thing about GIMP: it colour codes the negative of the channel, but you could just invert to make it easier to show multiple channels simultaneously.

At the moment I am pretty busy with other features (the skipping and blacklisting of annotations

if you have time to just add a few more urban labels when you next update it, that would enable me to upload with more useful separations (road vs pavement, building vs house.. grass & bushes vs trees, bicycle / motorbike, river + boat ).. maybe you're worried about the label list getting too long, but you've got the 'most popular labels' as shortcuts in the 'add labels' mode - I think that would be enough without needing the complex 'label graph' idea

dobkeratops commented 6 years ago

(I keep reading neural nets take '100,000s of examples' to generalise; I wondered if pixel-level hints like the polygon boundaries or channels would count as many examples for low-level feature detectors. I think cars/people/buildings/trees are reasonably identifiable by small features like leaves, window corners, wheel arches, rounded car window/body styling corners. I'm hoping the work of tracing those is almost like each edge drawn being a training hint, compared to the raw square images of the CIFAR or ImageNet challenges, where background & object are in the same submission.)

bbernhard commented 6 years ago

right; initially it looks like hell but once you dig through it, it's got everything needed .. and I'm absolutely certain you could streamline it with scripting. It's not ideal, but it's absolutely capable..

Cool! "Streamline it with scripting" sounds really interesting. In the end it probably comes down to how attractive we can make the tool to the average (or more advanced) user. If it is too complicated, users probably won't use it, and then the maintenance effort outweighs the benefits. But pixel-based annotation is actually pretty cool, so I think it won't hurt to invest some (research) time here.

if you have time to just add a few more urban labels when you next update it, that would enable me to upload with more useful separations (road vs pavement, building vs house.. grass & bushes vs trees, bicycle / motorbike, river + boat ).. maybe you're worried about the label list getting too long, but you've got the 'most popular labels' as shortcuts in the 'add labels' mode - I think that would be enough without needing the complex 'label graph' idea

I'll add bicycle and building to the labels list during the next maintenance (probably tomorrow) - completely forgot to add that during the last maintenance window.

Regarding the other labels: I think I am almost done with the API that would allow you to label any image freely. So if you don't mind waiting for that (I think I need another ~2 weeks for that), I would prefer if we could do it that way. It's not that I don't want to add the labels, but I think that the API token support, which will also be introduced along the way, could be extremely useful.

What we are lacking a bit at the moment is a way to group images together depending on a use case (e.g. "self driving car"). I think relying on the label alone might not be enough, so we probably need some kind of meta information (tags?) to do that. At the moment I am not really sure how to accomplish this... hopefully this is something we will eventually find out, as our dataset grows.

In the meantime, we could use API tokens to "hold the data together". As you are uploading urban/street scenes almost exclusively, we could easily filter the dataset later to find all images that were uploaded with the API token that matches your account. That way we could later add the "urban/street scene" tag/label/identifier/whatever to a bunch of uploaded images at once. Another benefit would be that you could start working on your own donations (as soon as the UI supports that).

While I am working on the API support, you can assume that there will be a way to add your desired label(s) on image upload. So if you would like, you could already start separating the images accordingly and once I am ready, you could bulk upload them. You can either stick with the "folder name as label" approach or add a labels.txt/json file within each folder (which contains a list of all labels). I haven't settled on a format for the labels file yet - so if you have any preferences, just let me know. I guess a simple txt file - with a label per line - would probably do it for the moment, but we can of course also go with a more structured (json?) format, if you prefer.
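To make the two options concrete, here is a small sketch of how an upload script might resolve the labels for one folder. Nothing here is a settled format: the file names `labels.json` and `labels.txt`, the JSON layout (a plain array of strings) and the rule that the JSON file wins over the txt file are all assumptions, and `labels_for_folder` is a hypothetical helper.

```python
import json
from pathlib import Path


def labels_for_folder(folder):
    """Return the labels to attach to every image in `folder`."""
    folder = Path(folder)
    json_file = folder / "labels.json"
    txt_file = folder / "labels.txt"
    if json_file.is_file():
        # Assumed layout: a JSON array of strings, e.g. ["road", "pavement"].
        labels = json.loads(json_file.read_text(encoding="utf-8"))
    elif txt_file.is_file():
        # One label per line.
        labels = txt_file.read_text(encoding="utf-8").splitlines()
    else:
        # Fall back to the "folder name as label" approach.
        labels = [folder.name]
    # Trim whitespace and drop blank entries.
    return [label.strip() for label in labels if label.strip()]
```

With this, a folder named `pavement` without any labels file yields `["pavement"]`, while a folder containing a `labels.txt` yields one label per non-empty line.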

(I keep reading neural nets take '100,000s of examples' to generalise; I wondered if pixel-level hints like the polygon boundaries or channels would count as many examples for low-level feature detectors. I think cars/people/buildings/trees are reasonably identifiable by small features like leaves, window corners, wheel arches, rounded car window/body styling corners. I'm hoping the work of tracing those is almost like each edge drawn being a training hint, compared to the raw square images of the CIFAR or ImageNet challenges, where background & object are in the same submission.)

that's an interesting thought - and makes pixel-based annotation even more desirable. :)

dobkeratops commented 6 years ago

What we are lacking a bit at the moment is a way to group images together depending on a use case (e.g. "self driving car").

My thinking was that with enough labels, they would inform it - a self-driving car requires the label Road, and so on. (Imagine if you could query a boolean expression.. "road | pavement" .. "road & !(car|person)" for empty roads, etc.)

But I can see informing it directly ("this is for SDCs.. this is for training your cleaning robot.." etc) would open up more UI streamlining possibilities (i.e. the annotation tools, the label set..)
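The boolean label query mentioned above could be prototyped along these lines. This is only a sketch: `matches` is a hypothetical function, the operators `|`, `&`, `!` and parentheses follow the examples in the comment, the precedence (`!` over `&` over `|`) is an assumption, and label names are limited to single words here, so a label such as "license plate" would need a different tokenizer.

```python
import re


def matches(query, labels):
    """Evaluate a query such as 'road & !(car|person)' against a set of labels."""
    tokens = re.findall(r"[A-Za-z_]\w*|[()&|!]", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token is None or token in ")&|!":
            raise ValueError("expected a label name")
        return token in labels

    result = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: " + peek())
    return result
```

For example, `matches("road & !(car|person)", {"road", "tree"})` is true (an empty road), while the same query against `{"road", "car"}` is false.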

So if you would like, you could already start separating the images accordingly and once I am ready, you could bulk upload them.

.. ok the next set can go that way.. I've kind of got into the swing of uploading a few hundred after each trip; I can hold a few back for later. 'building' and 'bicycle' would help quite a lot. For my own experiments I can of course use directories on my local drive and so on

bbernhard commented 6 years ago

My thinking was that with enough labels, they would inform it - a self-driving car requires the label Road, and so on. (Imagine if you could query a boolean expression.. "road | pavement" .. "road & !(car|person)" for empty roads, etc.)

yeah, right. I think that works most of the time, given that all the labels that are needed to identify the image as part of the "self driving car dataset" are already set. But I guess there will be images where we only have a label person so far, but which could also be part of the SDC dataset. Identifying those images early could have the benefit that we could customize the UI in a way that works best for labeling/annotating SDC images. Another thing is stills: I guess that your pictures are way better for training an SDC neural net than e.g. still images.

.. ok the next set can go that way.. I've kind of got into the swing of uploading a few hundred after each trip;

I guessed so. Awesome work, btw! With all your image uploads we'll reach the 10k milestone earlier than expected :D

I try to keep up with your pace and unlock images every day. But what I noticed recently is that there is quite a significant number of pictures where we need to do small adjustments (regarding privacy). As already mentioned, that's something we need to take care of on the server side... so no worries if you upload them. At the moment I don't want to delete them, as I am still hoping to come up with a (semi)automated way to blur faces/license plates - and those images would be a great way to put the solution to the test. But as the images block the queue and I always have to skip them before I can unlock the new images, I am thinking about introducing a "quarantine" where I can put those images. That way we could keep the images, but get them out of the "image unlocking queue".

bbernhard commented 6 years ago

The labels building and bicycle should be available now.

dobkeratops commented 6 years ago

The labels building and bicycle should be available now.

That's great, thanks!

quite a significant number of pictures where we need to do small adjustments (regarding privacy).

I am thinking about introducing a "quarantine" where I can put those images.

.. quarantine sounds like the best solution now

I wonder if they can just be down-ressed as a stopgap. Recently I was trying to just walk around taking photos of cars with the license plate not in view :) I'm sure the server-side auto-plate recogniser could be a good solution. I also have the potential idea of a GIMP script for images with markup, but those would be a small fraction of the total set; doing that is much slower than just taking photos.

One 'safe option' might be submitting low-res versions of the images, and submitting higher res versions once at least 'face' and 'license plate' have been marked up, then those could be blurred out before upload

I guessed so. Awesome work, btw! With all your image uploads we'll reach the 10k milestone earlier than expected :D

basically I like to keep in the habit of cycling or walking on a regular basis so it's pretty easy to do, I may run out of local environments, on the other hand you could say it's impossible that one person could exhaustively survey a single city.. I was thinking about the volume required - '100,000 examples for a NN to generalise' - perhaps 10,000 photos (reachable in 20 days by one person) x '10 annotations per image' is enough to train something interesting.

I haven't messed with the frameworks like TensorFlow myself yet; I'm thinking back to when I played around with CNNs myself (from scratch in OpenCL), one reason I gave up was 'not having any interesting data'. I might try again. What would be interesting is training the pixel-level segmentation, and using its 'current state' to guide which examples it needs to improve (i.e. pick the images it fails worst on)
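One way to read "pick the images it fails worst on" is to score each image by how well the current model's mask overlaps the hand-painted one and then surface the lowest scores. The sketch below uses intersection over union for that score. The metric, the dict-of-masks layout and the names `iou` and `worst_images` are assumptions for illustration, not something the thread settles on.

```python
def iou(predicted, truth):
    """Intersection over union of two binary masks given as lists of rows."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly.
    return intersection / union if union else 1.0


def worst_images(predictions, ground_truth, count):
    """Return the `count` image ids with the lowest IoU, worst first.

    `predictions` and `ground_truth` map an image id to its binary mask.
    """
    scores = {
        image_id: iou(predictions[image_id], ground_truth[image_id])
        for image_id in ground_truth
    }
    return sorted(scores, key=scores.get)[:count]
```

The returned ids could then be queued for extra annotation or weighted more heavily in the next training round.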

dobkeratops commented 6 years ago

Interesting find in GIMP: as I suspected, there are decent selection tools buried in there.. this is their 'intelligent-scissors' tool, it's a bit like your smart-annotation feature (and I'm sure Photoshop has similar). You can draw a polygon and go back and split the edges/tweak the control points, then you get a selection mask (as such it can be saved as a named channel). In some of the fiddlier cases it's still quicker to just paint it though.. but here you can combine tools.

ImageMonkey / imagemonkey-core

brainstorming/experiments , offline painted channels #98