ImageMonkey / imagemonkey-core

ImageMonkey is an attempt to create a free, public open source image dataset.
https://imagemonkey.io

where to get data for training #300

Open dobkeratops opened 2 years ago

dobkeratops commented 2 years ago

hi again,

I have started using PyTorch with my RTX3080, which seems fast enough to experiment with (I'd been put off in the past by training times)

I'm tinkering with denoising autoencoders (and eventually want to try using parts of the Stable Diffusion model that's all the rage now, but initially I'm experimenting with my own smaller examples); currently my intention is to make something to enhance lo-res retro/indie game art with neural nets, so I'm getting the pieces in place.. a simple runtime, and something in PyTorch for training

what I've got in mind is

I'd like to grab the imagemonkey database to set up training for the following in particular:

Road+Pavement

left/man, right/man, left/woman, right/woman

.. plus the entire label list

These labels and a lot of the examples are locked.. would you be able to approve them and make them accessible somehow? There are 3000 person outlines in the format I'm after (1500 x left/right man, 1500 x left/right woman).

I'll try to set up multi-task training -

I've managed to adapt my own "DataLoader" in PyTorch for a denoising autoencoder, which I'll extend to do all this

(the repo has a simple OpenCL inference test that's intended to grow into a little library to integrate with game engines, and my PyTorch training setup)

dobkeratops commented 2 years ago

OK, I found this (I see the blog wasn't updated, so there is a more recent download): https://github.com/ImageMonkey/imagemonkey-core/blob/master/public_backups/public_backups.json with ..16_6_22.zip, which should be pretty good.

Hopefully that will get me the bulk of the street annotations (road, pavement) and some of the person data, but I'd be waiting for unlock and approval of left/man, right/man, left/woman, right/woman to get the best of what's there. (I appreciate unlocking those images will take time.. if you could prioritise them based on the total number of annotations + inclusion of those specific labels (which give me orientation information), that would maximise the value per unlock.)
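The prioritisation suggested above could be sketched roughly as follows. The record layout (`id`, `labels`, `num_annotations`) is hypothetical and invented purely for illustration; it is not ImageMonkey's actual schema. The ordering rule (orientation labels first, then annotation count) is just one possible reading of the request.

```python
# Hypothetical sketch: rank locked images so each unlock yields the most value.
# The dict layout is an assumption for illustration, not ImageMonkey's schema.

ORIENTATION_LABELS = {"left/man", "right/man", "left/woman", "right/woman"}


def unlock_priority(image):
    """Sort key: images carrying orientation labels first, then by annotation count."""
    has_orientation = bool(ORIENTATION_LABELS & set(image["labels"]))
    return (has_orientation, image["num_annotations"])


def rank_for_unlock(images):
    """Return images ordered from most to least valuable to unlock."""
    return sorted(images, key=unlock_priority, reverse=True)


locked = [
    {"id": "a", "labels": ["road", "pavement"], "num_annotations": 12},
    {"id": "b", "labels": ["left/man", "right/man"], "num_annotations": 4},
    {"id": "c", "labels": ["left/woman", "road"], "num_annotations": 9},
]
ranked_ids = [img["id"] for img in rank_for_unlock(locked)]
print(ranked_ids)
```

Here image "c" comes out ahead of "b" because both carry orientation labels but "c" has more annotations overall; "a" comes last despite having the most annotations, since it has no orientation label.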

I can't get the python library working.. what would be ideal is a plain download script without any other dependencies (docker, tensorflow) .. I remember we had a python program to upload images with custom labels which was great, but I can't find that again (it's on a broken laptop). Essentially I'm after the inverse of that now

I can try to dig around, but if you can make a way to just get the raw images and outlines (JSON dumps of the type the site shows in explore, image files named with their ID?) without any other dependencies/installs, that will help me out

I've gone down the PyTorch route; I need to make a custom net with the ability to hint multi-task learning based on as much as is available from as many images as possible:

[1] baseline, no annotations is an autoencoder (e.g. use all the images)
[2] then a branch from the innermost latent vector can predict the whole label list, for images without annotations (e.g. use all the images + labels)
[3] then where annotations are available, do the per-pixel training (e.g. a few layers back from reconstruction in the autoencoder, to force this to keep semantic-relevant texture information)
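A framework-agnostic sketch of how the three tiers above might combine per sample: every image contributes a reconstruction term, and the label and per-pixel terms are added only when that supervision exists. The task names, weights and sample layout are illustrative assumptions; in PyTorch the loss values would be tensors rather than floats.

```python
# Hypothetical sketch of the three-tier scheme. Names, weights and the sample
# dict layout are assumptions for illustration only.

def active_tasks(sample):
    """Return the loss terms that apply to one training sample."""
    tasks = ["reconstruction"]        # [1] autoencoder: usable for all images
    if sample.get("labels"):
        tasks.append("label_list")    # [2] image-level label prediction
    if sample.get("annotations"):
        tasks.append("per_pixel")     # [3] per-pixel training
    return tasks


def total_loss(sample, losses, weights):
    """Weighted sum over only the tasks this sample can supervise."""
    return sum(weights[t] * losses[t] for t in active_tasks(sample))


weights = {"reconstruction": 1.0, "label_list": 0.5, "per_pixel": 2.0}
losses = {"reconstruction": 0.25, "label_list": 0.5, "per_pixel": 0.125}

unlabelled = {"image": "img0"}
labelled = {"image": "img1", "labels": ["road"]}
annotated = {"image": "img2", "labels": ["road"], "annotations": [{"label": "road"}]}

for s in (unlabelled, labelled, annotated):
    print(s["image"], active_tasks(s), total_loss(s, losses, weights))
```

Skipping missing terms entirely (rather than feeding zeros as targets) means an image without annotations never pushes the per-pixel branch towards "nothing here".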

bbernhard commented 2 years ago

Sorry for being a bit unresponsive lately - renovating my flat unfortunately takes away most of my free time :/

Awesome to hear that you got back to experimenting with neural nets - that sounds really interesting!

> I can't get the python library working.. what would be ideal is a plain download script without any other dependencies (docker, tensorflow)

I think there's nothing like that right now, unfortunately. But I'll try to hack a small Python script together for you. What are you interested in exactly? The images together with the labels or do you want to also have the annotations?

Regarding the labels: I plan to unlock/approve a bunch of those left/* and right/* labels this weekend (had to fix a critical bug first which prevented me from doing this earlier).

dobkeratops commented 2 years ago

Right, good luck with the move! I appreciate that will chew up a lot of time and focus. I got into PyTorch; I had the 3080 but was too paranoid to use it during the GPU shortage. Now that's over, that fear is gone, and I've seen I can actually get results out of it that are interesting enough to engage me.

> The images together with the labels or do you want to also have the annotations?

Images, Label List (including un-annotated)+ Annotations (polygons, bounding boxes): The format I saw it show in "explore -> export" through the UI is fine (it seems disabled at the moment, I guess things might be in flux).

Because I know my way around the dataset, I hope I'll be able to set up multi-task training that makes the most of the different permutations we have.. and the graph information. Basically we have many examples with "arm/man, leg/man", others with "left/man, right/man", and a few with both, so with 2 branches that feed off the same feature, both will complement each other. (And we have a few left/arm/man, right/arm/man done explicitly.)

There's someone else interested in collaborating on making nets to enhance retro games. Lots of pieces to get this working. I've written something in OpenCL to run conv-nets that I can integrate in-engine (that's to get going, eventually I'll need Metal compute, CUDA etc), but I will stick with PyTorch as the main workhorse for actual training.

I'm also hoping to use Stable Diffusion to enhance the images 'somehow' (completely open). This net is awesome BUT pretty heavyweight (5 GB? of weights).

I figure we can make lightweight nets that can run in game , and the open-ended image-monkey data will focus things beyond a plain autoencoder as a starting point

You've probably heard of the controversy around Stable Diffusion - i.e. some people think the legality of their source data is dubious, and like YouTube, eventually the issue will come to a head.
This kind of use case is exactly why I'd been interested in ImageMonkey all along.. a CC0, open-ended basis for anything generative. I'm sure we're not the only people to have this idea, and our 200,000 or so annotations can be combined with other datasets, and in the end it should be possible for the community to do what Stable Diffusion does with no guilt :)

Seeing Stable Diffusion going, and seeing the end of the GPU shortage, has got my enthusiasm up.. all that voluntary annotating will not be in vain. And I'm also hoping general interest in these kinds of use cases will get more people interested in contributing.

I also want to look into adapting my engine to render additional channels to feed the NNs. (again pre-process textures to create an embedding, essentially screen-space texturing)

I don't have photorealism in mind.. more like "playing a comic book / concept art", something usable for indie/retro games, rather than the AAA look.

dobkeratops commented 2 years ago

https://github.com/dobkeratops/convnet_stuff very early days, but here's the repo I'm sharing my current experiments in. I want to write an imagemonkey data loader there. My Rust game engine isn't open at the moment, but the neural enhancement parts all will be, and I'm hoping to retrofit that to MAME and other emulators etc.. too many ideas to list..
EDIT: broken repo, renamed

dobkeratops commented 2 years ago

> Regarding the labels: I plan to unlock/approve a bunch of those left/* and right/* labels this weekend (had to fix a critical bug first which prevented me from doing this earlier).

If you do get time to enable these - maybe you could also add "clothing"; there are 1000 examples annotated. It's useful to be precise about the difference between the person's outline and any coats, dresses etc, + surface texture, and it gives information that wouldn't be inferable from 3D scans (these polys can also be refined with the material system)

[image: Screenshot 2022-09-30 at 04 50 23]

bbernhard commented 1 year ago

short update: clothing is now unlocked. I also unlocked a bunch of left/* and right/* labels - the remaining ones should be unlocked in the next days. (before a new label gets unlocked, the CI automatically runs a lot of regression tests to make sure that everything is still fine. As hundreds of unit and integration tests are run, those test runs take ~30min each - so it takes a bit to unlock everything :))

I also added a small Python snippet which demonstrates how to query the database and export data: https://github.com/ImageMonkey/imagemonkey-libs/blob/master/python/snippets/export.py. The script requires the X_API_TOKEN variable set in the secrets.py file (see the Readme here for details: https://github.com/ImageMonkey/imagemonkey-libs/tree/master/python/snippets)

dobkeratops commented 1 year ago

awesome, I'm making progress with PyTorch; I've got it set up to use input/output image pairs. [image: training_progress] WIP in this repo:

https://github.com/dobkeratops/convnet_stuff

bbernhard commented 1 year ago

awesome, that looks really interesting!

dobkeratops commented 1 year ago

https://github.com/dobkeratops/convnet_stuff , the readme explains a bit more

[image: training_progress]

Now I've set it up for multiple input or output images; I want this for graphics use cases (e.g. color + normal maps, video frames etc). In this test I split the RGB into 2 images and gave it some features painted out in the output, and it's learned to reconstruct, and which bits to fill in. It's probably overfitted badly though; this is just a quick test that runs for a few mins with a dozen images. Looking forward to getting the imagemonkey data into it. It needs some more work to handle multi-task training (optional branches for missing images).

It's a "u-net": an encoder-decoder with skip connections. My interest is in training ones which can be run in game (NVIDIA has proven this for denoising and super resolution; I want to run it at lower res for stylising).

dobkeratops commented 1 year ago

> short update: clothing is now unlocked. I also unlocked a bunch of left/* and right/* labels - the remaining ones should be unlocked in the next days. (before a new label gets unlocked, the CI automatically runs a lot of regression tests to make sure that everything is still fine. As hundreds of unit and integration tests are run, those test runs take ~30min each - so it takes a bit to unlock everything :))
>
> I also added a small Python snippet which demonstrates how to query the database and export data: https://github.com/ImageMonkey/imagemonkey-libs/blob/master/python/snippets/export.py. The script requires the X_API_TOKEN variable set in the secrets.py file (see the Readme here for details: https://github.com/ImageMonkey/imagemonkey-libs/tree/master/python/snippets)

Running this now.. so far so good, I'm able to get a bunch of images and write JSON for the annotations.

dobkeratops commented 1 year ago

[image: Screenshot 2022-10-06 at 08 40 04]

Getting images & annotations through; I should be able to train on this soon enough. It'll also be great to finally be able to view them in bulk with all the annotations together like this: [image: image_thumbnails]

dobkeratops commented 1 year ago

download/render code here. https://github.com/dobkeratops/convnet_stuff/tree/main/pytorch

left/right: [image: left_right_thumbnails]

road/pavement: blending here assumes mutually exclusive annotations (not true in general, but I might be able to improve what this looks like with hard-coded label priority.. some are more likely background vs foreground features). [image: road_pavement]
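A minimal sketch of what a hard-coded label priority could look like when annotations overlap: labels are drawn in a fixed order so likely-foreground labels overwrite likely-background ones. The priority order and the mask format (2D lists of 0/1) are assumptions for illustration, not what the download/render code in the repo actually does.

```python
# Hypothetical sketch: resolve overlapping annotation masks with a fixed
# label priority. Order and mask format are assumptions for illustration.

PRIORITY = ["road", "pavement", "person"]  # background first, foreground last


def composite(masks, height, width, priority=PRIORITY):
    """Return a 2D grid of label names (or None) after priority blending."""
    out = [[None] * width for _ in range(height)]
    for label in priority:
        mask = masks.get(label)
        if mask is None:
            continue  # this label is not annotated on this image
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    out[y][x] = label  # later (higher-priority) labels overwrite
    return out


masks = {
    "road":     [[1, 1, 1], [1, 1, 1]],
    "pavement": [[0, 0, 1], [0, 0, 1]],
    "person":   [[0, 1, 0], [0, 0, 0]],
}
label_map = composite(masks, height=2, width=3)
for row in label_map:
    print(row)
```

Pixels covered by no mask stay `None`, which keeps "unannotated" distinct from any real label when the result is later used as a training target.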

all person part annotations: the preview tries to show overlapping annotations better. [image: person_thumbnails] [image: street_thumbnails]

dobkeratops commented 1 year ago

Would there be a way to use these IDs to open an image in unified? I can easily make something here to render a page with clickable links, although updating the thumbnails might be trickier. It's really easy to flick through these thumbnails to find images that need more work - and to find the best ones to train on. Alternatively, maybe there's an API to update annotations directly.. I could do them in a local tool

bbernhard commented 1 year ago

Awesome progress - thanks for sharing!

> Would there be a way to use these IDs to open an image in unified? I can easily make something here to render a page with clickable links, although updating the thumbnails might be trickier. It's really easy to flick through these thumbnails to find images that need more work - and to find the best ones to train on

This should work:

https://imagemonkey.io/annotate?mode=browse&view=unified&v=2&image_id=0ab0d1d0-d5db-464c-86a3-a483f8279b3a

> Alternatively, maybe there's an API to update annotations directly.. I could do them in a local tool

There are also endpoints to label & annotate images. But the annotate endpoint especially is a bit cumbersome to use (mostly because the unified mode was added later on, and the schema of the database and the API endpoints didn't fit the unified mode nicely, so I had to work around some limitations). But I think the label endpoint(s) aren't that complicated to use. So in case you want to give it a try and add it to your tool, I could look the API calls up.

dobkeratops commented 1 year ago

works fine, that's great

dobkeratops commented 1 year ago

Got that working now: it spits out an HTML page of thumbnails, they can be clicked, and it fires up the unified editor.. awesome! That will make finding work so much easier. And once I get it training I'll be able to do the same sort of thing.. find the unlabelled images it predicted badly, and use them as priorities for annotation..
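A minimal sketch of such a page generator, using only the Python standard library. The URL pattern is the one posted earlier in this thread; the function names, the thumbnail paths and the second image ID are hypothetical placeholders, and this is not the actual code from the repo.

```python
import html
from urllib.parse import urlencode

# URL pattern taken from the link posted earlier in this thread.
BASE_URL = "https://imagemonkey.io/annotate"


def unified_url(image_id):
    """Build a link that opens one image in the unified annotation view."""
    query = urlencode(
        {"mode": "browse", "view": "unified", "v": 2, "image_id": image_id}
    )
    return f"{BASE_URL}?{query}"


def thumbnail_page(entries):
    """entries: list of (image_id, thumbnail_path) pairs (paths are hypothetical)."""
    items = []
    for image_id, thumb in entries:
        items.append(
            f'<a href="{html.escape(unified_url(image_id))}">'
            f'<img src="{html.escape(thumb)}" alt="{html.escape(image_id)}"></a>'
        )
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(items) + "\n</body></html>\n"


page = thumbnail_page(
    [
        ("0ab0d1d0-d5db-464c-86a3-a483f8279b3a", "thumbs/0ab0d1d0.jpg"),
        ("example-image-id", "thumbs/example.jpg"),  # placeholder ID
    ]
)
print(page)
```

The same generator could be fed a list sorted by prediction error once a model is training, so the worst-predicted images appear first on the page.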

dobkeratops commented 1 year ago

WIP.. trying to train a u-net from scratch on left/right person data. Early days, I suspect this will take a while..

[images: training_progress10, training_progress12]

dobkeratops commented 1 year ago

3 hours and it's getting left/right correct :) (initially it was just shading everything green on the left regardless of orientation). This is quite encouraging; I can run an experiment like this every day.

[images: training_progress250, training_progress252, training_progress253, training_progress111]

dobkeratops commented 1 year ago

generated along with the training images https://dobkeratops.github.io/imagemonkey_pages/left_right_man_woman/html/index.html

maybe you could integrate these thumbnail summaries somewhere in the site (even if they don't update when you annotate.. I think you had concerns about how long it takes the server to do this sort of thing)

I might experiment a bit with some static layouts for browsing all the images this way

dobkeratops commented 1 year ago

Trying to get data for the road/pavement task - with these thumbnail previews I can find the images with gaps and fill them. imagemonkey_pages/road [image: Screenshot 2022-10-10 at 19 10 44]