Welthungerhilfe / ChildGrowthMonitor

Quick, accurate data on malnutrition
GNU General Public License v3.0

Machine Learning / Deep Learning #47

Open mmatiaschek opened 6 years ago

mmatiaschek commented 6 years ago

The Learning/Prediction Pipeline

360° turn

Objective:

3 million children are dying of malnutrition every year. We need a game-changer to identify malnutrition in children and to replace the manual measurements of weight and height, which are costly, slow and often inaccurate. Our mobile app scans children, collecting 3D point clouds and video data to extract anthropometric measurements such as those shown below:

[image: pre-test_6]

Also, for rapid assessment, especially in offline regions and on cheap mass-market smartphones, classifying children from video as severely acute malnourished (SAM), moderately acute malnourished (MAM), normal or overweight would be valuable.

Special attention must be given to uncooperative children, imperfect lighting and poor internet connectivity, all of which a useful general approach needs to handle.

Our goal is to do online learning, so we can gradually improve the quality of our measurements.
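To make the online-learning idea a little more concrete, here is a minimal sketch of what incremental model updates could look like, assuming a scikit-learn model; the feature-extraction step and the data source are hypothetical placeholders, not existing project code:

```python
# Minimal sketch of incremental ("online") model updates, assuming scikit-learn.
# The feature extraction and the data source are placeholders, not existing project code.
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(loss="huber")  # robust to occasional bad manual measurements

def update_with_new_batch(features: np.ndarray, heights_mm: np.ndarray) -> None:
    """Refine the current model with one freshly collected, validated batch."""
    model.partial_fit(features, heights_mm)

# e.g. called whenever a new batch of scans with ground-truth measurements is synced:
# update_with_new_batch(extract_features(point_clouds), manual_heights_mm)
```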

Available data

Our current dataset consists of

which amounts to 10 GB of zip-compressed point cloud data.

[image: pre-test_6b]

Over the next 6 months we will collect data from

which will amount to more than 250 GB of data.

Concepts to explore

Absolute millimeters from Point Clouds

SAM/MAM/Normal/Overweight Classification
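A quick sketch of how both concepts could be attacked with the same backbone, assuming a PointNet-style network in PyTorch; the architecture, layer sizes and names are my own assumptions for illustration, not project decisions:

```python
# Sketch of a PointNet-style network that maps a raw point cloud (N x 3) either to
# absolute measurements in millimetres or to a SAM/MAM/normal/overweight class.
# Purely illustrative; architecture and sizes are assumptions, not project decisions.
import torch
import torch.nn as nn

class PointCloudNet(nn.Module):
    def __init__(self, num_classes: int = 4, regression: bool = True):
        super().__init__()
        # shared per-point MLP (implemented as 1x1 convolutions over the point axis)
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        out_dim = 1 if regression else num_classes  # 1 = e.g. height in mm
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, out_dim),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3) -> (batch, 3, num_points)
        x = self.point_mlp(points.transpose(1, 2))
        x = torch.max(x, dim=2).values   # order-invariant global feature
        return self.head(x)

# height_mm = PointCloudNet(regression=True)(torch.rand(2, 2048, 3))   # (2, 1)
# logits    = PointCloudNet(regression=False)(torch.rand(2, 2048, 3))  # (2, 4)
```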

mmatiaschek commented 6 years ago

Interesting work on point clouds

List of Links: https://github.com/bluebox42/3D-deep-learning

AI-Guru commented 5 years ago

After a couple of weeks working on the project, I would like to share some ideas.

Multi-View CNN - something we should definitely have a look at

See for example: http://vis-www.cs.umass.edu/mvcnn/

Or, to put it in a picture:

[image: multi-view CNN illustration]

What does it do? It takes an ordered sequence of pictures into account; this is the 360° view we have been talking about a couple of times.
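As a rough illustration of the idea (not the actual MVCNN code): a shared CNN encodes every view, then a view-pooling step merges the per-view features before the final head. The ResNet-18 backbone and the single regression output below are assumptions:

```python
# Rough sketch of the Multi-View CNN idea: one shared CNN encodes every view,
# then a view-pooling step merges the per-view features before the final head.
# Backbone, sizes and the single regression output are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

class MultiViewNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()          # keep the 512-d feature vector
        self.backbone = backbone
        self.head = nn.Linear(512, 1)        # e.g. height in mm

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 3, H, W)
        b, v, c, h, w = views.shape
        feats = self.backbone(views.reshape(b * v, c, h, w)).reshape(b, v, -1)
        pooled = feats.max(dim=1).values     # view pooling as in MVCNN
        return self.head(pooled)

# prediction = MultiViewNet()(torch.rand(2, 12, 3, 224, 224))  # 12 views per child
```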

How to turn this into a process? What are the alternatives for capturing images?

  1. Let the kid rotate a couple of degrees, then take a photo. Repeat until you reach a full circle. This could work very badly with uncooperative kids.
  2. Let the measurement pro rotate instead: the kid stands still and the measurer moves around them. The app could automatically take pictures using the gyroscope. This would be similar to panorama photo stitching on smartphones, but inverted. I would at least run an experiment to see whether pictures can be taken automatically based on rotation.
  3. Create an array of n cameras in a circle and take photos with that. This would definitely work well, but it also goes against the app idea.

As a start, we could comb through the already existing data and see if a fitting dataset can be derived from it.

I believe this can be generalized to use point clouds instead of pictures as well.

One option for proving the point: artificial data.

You can get some inspiration from the following excellent Apple article: https://machinelearning.apple.com/2017/07/07/GAN.html

Bottom line: usually you need far too many images to solve problems similar to ours. A solution is to use rendered data. Apple found that this works, although they had to add a refinement step that makes the artificial data more realistic.
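To make that refinement step a bit more tangible: in the Apple article, a small fully convolutional "refiner" network translates rendered images towards realistic ones, trained adversarially with an extra term that keeps the output close to the render. A very rough sketch; the layer sizes and the regularisation weight are my own guesses, not values from the paper:

```python
# Very rough sketch of a SimGAN-style refiner: a fully convolutional network that
# nudges rendered images towards realistic ones while staying close to the input.
# Layer sizes and the self-regularisation weight are illustrative assumptions.
import torch
import torch.nn as nn

class Refiner(nn.Module):
    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 1),
        )

    def forward(self, rendered: torch.Tensor) -> torch.Tensor:
        return self.net(rendered)

def refiner_loss(refined, rendered, discriminator_logits):
    # adversarial term ("looks real") + self-regularisation ("stays close to the render")
    adv = nn.functional.binary_cross_entropy_with_logits(
        discriminator_logits, torch.ones_like(discriminator_logits))
    reg = (refined - rendered).abs().mean()
    return adv + 0.5 * reg
```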

I did some quick research and experiments. There is already research in our domain: http://grail.cs.washington.edu/projects/digital-human/pub/allen03space-submit.pdf It looks like parametrized models: you can change body metrics (e.g. weight, height) and body features (e.g. skin tone) of 3D models.

All you need to do is create a pipeline that automatically generates images/point clouds from some parametrized models. I tried Blender for a couple of minutes. It can be called from the command line, making it perfect for automatic rendering of 3D data. Note that I did not do extensive research; I just used Blender as an example. Here are some simple images rendered from a simple model:

[images: 0000-0050, simple rendered frames]
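A sketch of how such a rendering pipeline could be driven from the command line; the .blend file, the Blender-side script render_views.py and the parameter grid are all hypothetical:

```python
# Sketch of an automatic rendering pipeline that calls Blender from the command line
# once per parameter set. The .blend file, script name and parameters are hypothetical.
import json
import subprocess
from itertools import product

HEIGHTS_CM = [70, 80, 90, 100, 110]
WEIGHTS_KG = [7, 9, 11, 13, 15]

for i, (height, weight) in enumerate(product(HEIGHTS_CM, WEIGHTS_KG)):
    params = json.dumps({"height_cm": height, "weight_kg": weight, "index": i})
    subprocess.run(
        [
            "blender", "--background", "child_model.blend",
            "--python", "render_views.py",  # Blender-side script reads the params
            "--", params,                   # everything after -- is passed to that script
        ],
        check=True,
    )
```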

Definitely have a look at Apple's ARKit2

Look first:

[image: ARKit 2 world-mapping illustration]

See this picture animated in this great article: https://medium.com/@mohams3ios01/an-introduction-to-arkit-2-world-mapping-5b38827f8ec0

ARKit2 is currently in beta and will be available in autumn. Among other things, it makes the World Map available to developers. This is a data structure that maps reality into a point cloud of sorts, based on data from the motion sensors and the 2D camera.

Questions:

  1. What is the resolution of the World Map? Is it high enough to facilitate machine learning?
  2. Can the World Map be used for the rotating dataset generation (see above)?
  3. Is a similar functionality available on Android?
  4. How will this technology improve over the next couple of years? Might it fit our needs in the near future?