ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

tutorial/documentation #197

Open bbernhard opened 6 years ago

bbernhard commented 6 years ago

It would be cool to have some (beginner friendly) documentation. This could be a tutorial, a help page, a blog post, a youtube video, a FAQ page... I think we have some pretty interesting functionality (browse based mode, trending labels, queryability of data, attributes system, task based approach, ...) that is often either a bit hidden in the UI, or the intention of the feature isn't very well documented. At the moment, most of the stuff is discussed here (which is great btw.), but that leaves out the occasional user who doesn't have time to follow the discussions on the github issue tracker.

I think there is still room for improvement UI-wise, but I guess a bit more documentation won't hurt either - it could make the first contact with the service way more pleasant.

possible topics:

At the moment I have quite a few topics on my todo list, so I probably won't be able to invest much time into this. As English is not my native language either, I can't deny that I would be happy if someone else wants to step in ;)

(There is also the ImageMonkey blog, which hasn't seen a post in a while... if someone wants to blog about something, please let me know - I am happy about any help here).

edit: If you stumble across something on the site that could be better worded, please let me know (or create a pull request).

Thanks! :)

dobkeratops commented 6 years ago

I guess a video of use is probably the easiest form of documentation.

Just a general comment after dropping back into this for a bit after time away... I note plenty of new labels (nice to see), and the 'current label indicator' on the bottom right (also great - a fix for losing sight of the title).

I've also found the "random" mode seems to be more pleasant to use than it used to be; perhaps this is down to having crossed a critical mass of labels, i.e. there are more images where you just have a few examples (less daunting), and perhaps the wider image set as well (with the recent uploads) keeps the variety up. Great having both options usable.

bbernhard commented 6 years ago

Great to hear! :)

Lately, I was focusing a bit more on the machine learning integration...so there wasn't that much progress on the service in the last few days. Probably the biggest improvement was the browse based validation mode. In case you haven't seen it: https://imagemonkey.io/verify?mode=browse

(screenshot: browse-based validation mode)

I think that's a way more powerful way of validating images. The search field allows you to query for complex label expressions, but in my opinion it's way more powerful if you search for a single label. That way, you can concentrate on the image itself, instead of always checking both label and image. Click on an image once to mark it as "valid". Click on the image again to mark it as "invalid". If you are not sure whether you see an object or not, do not mark the image (= basically a "skip"). After you are done marking the images, click on the "Done" button.

Regarding the machine learning framework integration: I still need to fix a few small issues, but I think the whole thing is now almost in a state where it's usable for others as well. I tried to make it as easy as possible for users - i.e. there is a docker image which sets up everything that's needed for training a neural net. Inside the docker container, there is a script called monkey which communicates with the ImageMonkey REST API and instruments the machine learning frameworks.

e.g.: If you want to train a custom Mask RCNN model on the labels cat, dog and car, all you need to do is call the monkey script with the following parameters:

monkey train --type="object-segmentation" --labels="cat|dog|car" --epochs=3500  

This automatically downloads all images that have cat, dog and car annotations and trains a Mask RCNN model for 3500 epochs. (I've recently rented a root server with a GeForce GTX 1080 graphics card; the training there takes about ~2 days before the loss plateaus at ~0.15.)

Lately, I am mostly working on some small changes to the monkey script. e.g.: the additional --images-per-label parameter limits the number of images per class. (I've read that it's good practice to keep the number of images per label/class balanced to avoid training bias.) With the --images-per-label parameter one can set the max number of images per class. e.g.: Let's assume we have 600 annotated dog images, 400 annotated cat images and 300 annotated car images. When setting --images-per-label=300 we only use 300 images of each class, although there would be more available. Hopefully this avoids training bias, until we have reached a decent, equally distributed amount of annotations for all labels.
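Combined with the earlier example, a call with that parameter would look roughly like this (the exact flag placement is just my sketch, not a confirmed invocation):

monkey train --type="object-segmentation" --labels="cat|dog|car" --epochs=3500 --images-per-label=300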

My main motivation for all this is to make it as easy as possible to train a neural net on our dataset. Once we reach that point, I think it should be fairly easy to create a model on a regular basis. That pre-trained model can then be made available for download on the site. I think that might be a good way to visualize how our dataset improves over time. If we base our model on other models (e.g. ones trained on the COCO dataset), I think we should be able to re-train the last layers of the model in a decent amount of time.

dobkeratops commented 6 years ago

awesome.. 2 days is reasonable, at least it isn't 2 weeks :) I have a 1080 I could leave training (laptop and 970 for day-to-day use);

I'd be curious to see how far it would get with pavement vs. road... and finding the examples that confuse it most would be a great way to guide labelling.

bbernhard commented 6 years ago

Hopefully, this also attracts other people who have more ML knowledge (I am still a novice) and/or have a powerful GPU machine that they can "share". I think it would be pretty cool if we could distribute the workload a bit. People with big compute machines can train a model and then share the pre-trained model for download with others. The cool thing with the docker approach is that everybody has the same training environment. If we find a bug, we can fix it centrally and everybody will benefit from the fix.

bbernhard commented 6 years ago

awesome.. 2 days is reasonable, at least it isn't 2 weeks :) I have a 1080 I could leave training (laptop and 970 for day-to-day use);

If you are interested, then I can write down a short HowTo tomorrow (also need to push the latest image to dockerhub...the current one is a bit outdated :)).

The only thing you need is docker and nvidia-docker (see https://github.com/NVIDIA/nvidia-docker). The NVIDIA docker runtime is needed to access the GPU(s) inside the docker container. After you've installed the NVIDIA docker runtime, you should see your GPU information when you run the following command:

docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

If that works, then you are good to go :). (I've installed the docker nvidia runtime on a Debian machine lately... which worked pretty smoothly.)
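For reference, on Debian/Ubuntu the installation at the time boiled down to roughly the following (a sketch from memory - the linked nvidia-docker repo has the authoritative, up-to-date instructions):

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker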

dobkeratops commented 6 years ago

right, that should hopefully go better on my linux box, I'll have a read around..

bbernhard commented 6 years ago

@dobkeratops

After you've got the nvidia-docker runtime installed, all you need to do is start the training container:
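(The actual image name on dockerhub isn't spelled out in this thread, so imagemonkey/train below is just a placeholder - but starting it should look roughly like this:)

# hypothetical image name - replace with the actual ImageMonkey training image from dockerhub
docker run --runtime=nvidia -it imagemonkey/train /bin/bash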

After you've started the docker container, run the following command:

monkey train --type="object-segmentation" --labels="apple|dog" --epochs=1

This will train a Mask RCNN model on the labels apple and dog for one epoch. After the training has finished, the graph gets frozen and a tensorflow .pb file gets created. On my machine, it took about 3500 epochs until the loss plateaued at 0.15 (at the moment only the last layers of an already trained model are re-trained).

In case you want to train a model for image classification, you can use the following command:

monkey train --type="image-classification" --labels="apple|dog"

This will re-train a pre-trained inception v3 model on the labels apple and dog via transfer learning.