tue-robotics / tue_robocup

RoboCup challenge implementations
https://github.com/orgs/tue-robotics/projects/2

Data augmentation with noise, occlusions, warps, shears etc #502

Closed: Rayman closed this issue 6 years ago

Rayman commented 7 years ago

Automatically augment training data with noise, occlusions, warps, shears, etc. There is a way to do this, but how do we activate it? (Also see http://tflearn.org/data_augmentation/)

Enabling this via the RQT train GUI would be nice.
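
For reference, in tflearn (the link above) the augmentation is attached to the input layer and applied on the fly during training. A minimal sketch, assuming a tflearn-based model; our retrain pipeline may not use tflearn at all, and the shape and parameters below are only example values:

import tflearn
from tflearn.data_augmentation import ImageAugmentation

# real-time augmentation, applied only while training
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.0)
img_aug.add_random_blur(sigma_max=3.0)

# hypothetical input layer; the augmentation object is passed to it
net = tflearn.input_data(shape=[None, 299, 299, 3], data_augmentation=img_aug)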

LoyVanBeek commented 7 years ago

The code we use breaks with TensorFlow 1.3, but I have some fixes on a local branch at the moment.

JosjaG commented 6 years ago

Original, without any data augmentation: figure_original

After setting the mirror augmentation to true: figure_mirror_true

Training takes a lot more time, which is very inconvenient, but mirroring part of the data does (slightly) improve the accuracy. The increased training time is probably a CPU issue, so it might not be a problem once the GPU is used correctly.

Rayman commented 6 years ago

With data augmentation enabled, the bottleneck caching is disabled, so we should check whether we can train on the GPU. How long did training take?
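
A quick way to check whether TensorFlow actually sees and uses the GPU (plain TF 1.x calls, independent of our retrain wrapper):

from tensorflow.python.client import device_lib
import tensorflow as tf

# list the devices TensorFlow can see; a GPU shows up as '/device:GPU:0'
print([d.name for d in device_lib.list_local_devices()])

# log on which device each op is placed while running a session
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))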

LoyVanBeek commented 6 years ago

Besides mirroring, there are other augmentations to try out, such as the noise, occlusions, warps and shears mentioned in the issue title:
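
For instance, additive noise and random occlusions could look roughly like this (an illustrative numpy sketch, not part of the current pipeline; the noise level and patch size are arbitrary example values):

import numpy as np

def add_noise(image, sigma=0.05):
    # add zero-mean Gaussian noise to a float image in [0, 1]
    noisy = image + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_occlusion(image, patch_size=48):
    # black out a random square patch (a crude occlusion / cutout)
    h, w = image.shape[:2]
    y = np.random.randint(0, max(1, h - patch_size))
    x = np.random.randint(0, max(1, w - patch_size))
    occluded = image.copy()
    occluded[y:y + patch_size, x:x + patch_size] = 0.0
    return occluded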

LoyVanBeek commented 6 years ago

The retrain script from the TensorFlow examples offers --random_crop, --random_scale, --random_brightness and --flip_left_right.

Let's set each of these to 10%, see whether each one improves performance on its own, and if they do, enable them all together. Later, we can decide whether to implement the other augmentations as well.
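
Roughly, those four flags correspond to the following TF 1.x image ops (an approximation for illustration, not the actual retrain.py code; the exact interpretation of the percentage values is the script's, not this sketch's):

import tensorflow as tf

def distort(image, input_size=299, pct=10):
    # image: float32 tensor in [0, 1], shape [height, width, 3]
    # --flip_left_right: random horizontal mirroring
    image = tf.image.random_flip_left_right(image)
    # --random_brightness=10: brightness change of up to ~10%
    image = tf.image.random_brightness(image, max_delta=pct / 100.0)
    # --random_scale / --random_crop: scale up by up to ~10%,
    # then crop back down to the network's input size
    scale = tf.random_uniform([], 1.0, 1.0 + pct / 100.0)
    new_size = tf.cast(input_size * scale, tf.int32)
    image = tf.image.resize_images(image, tf.stack([new_size, new_size]))
    image = tf.random_crop(image, [input_size, input_size, 3])
    return image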

LoyVanBeek commented 6 years ago

To test various augmentations:

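# download and unpack the pretrained Inception model used as the base for retraining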
mkdir -p /tmp/inception
cd /tmp/inception
wget http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz
tar -zxf inception-2015-12-05.tgz

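# one output directory per augmentation experiment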
mkdir -p ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Crop/
mkdir -p ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Scale/
mkdir -p ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Brightness/
mkdir -p ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Flip/

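# retrain with a single augmentation type enabled at a time (each set to 10%)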
rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Crop/  --batch=100 --steps=1000 --random_crop=10

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Scale/  --batch=100 --steps=1000 --random_scale=10

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Brightness/  --batch=100 --steps=1000 --random_brightness=10

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Flip/  --batch=100 --steps=1000 --flip_left_right=10

LoyVanBeek commented 6 years ago

I ran rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Scale/ --batch=100 --steps=1000 --random_scale=10 on bob's ZBook with a GPU.

This ends in

2017-12-19 22:27:13.330667: Step 990: Train accuracy = 97.0%
2017-12-19 22:27:13.330716: Step 990: Cross entropy = 0.310233
2017-12-19 22:27:13.378443: Step 990: Validation accuracy = 94.0%
2017-12-19 22:28:01.398623: Step 999: Train accuracy = 97.0%
2017-12-19 22:28:01.398673: Step 999: Cross entropy = 0.298153
2017-12-19 22:28:01.447069: Step 999: Validation accuracy = 91.0%

The difference between train and validation accuracy is quite large, so the network is probably overfitting:

screenshot from 2017-12-19 22-33-50

We reach a plateau around 250-300 training steps; after that, the validation accuracy goes down.

LoyVanBeek commented 6 years ago

When running rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Crop/ --random_crop=10 --batch=100 --steps=1000:

screenshot from 2017-12-19 22-49-58

Again a similar gap in train vs. validation accuracy, so maybe overfitting.

2017-12-19 22:47:20.903366: Step 380: Train accuracy = 98.0%
2017-12-19 22:47:20.903447: Step 380: Cross entropy = 0.680191
2017-12-19 22:47:20.993919: Step 380: Validation accuracy = 94.0%
2017-12-19 22:50:23.217630: Step 390: Train accuracy = 99.0%
2017-12-19 22:50:23.217772: Step 390: Cross entropy = 0.654659
2017-12-19 22:50:23.330398: Step 390: Validation accuracy = 92.0%
^C

(I had to stop because I needed to go home.)

Matthijs: Trained overnight with the parameters above. The evaluation on the validation data: Final accuracy: 0.62443438914
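
For reference, an evaluation like this could be scripted roughly as follows (a hedged sketch, not the actual tensorflow_ros evaluation code; output_graph.pb, output_labels.txt and the validation/<label>/*.jpg layout are assumptions, while the tensor names final_result:0 and DecodeJpeg/contents:0 are retrain.py's defaults for the Inception v3 base):

import glob
import os
import numpy as np
import tensorflow as tf

# load the retrained graph
with tf.gfile.FastGFile('output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

labels = [line.strip() for line in open('output_labels.txt')]

correct = 0
total = 0
with tf.Session() as sess:
    softmax = sess.graph.get_tensor_by_name('final_result:0')
    for true_label in labels:
        for path in glob.glob(os.path.join('validation', true_label, '*.jpg')):
            image_data = tf.gfile.FastGFile(path, 'rb').read()
            predictions = sess.run(softmax, {'DecodeJpeg/contents:0': image_data})
            if labels[int(np.argmax(predictions))] == true_label:
                correct += 1
            total += 1

print('Final accuracy:', float(correct) / total)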

LoyVanBeek commented 6 years ago

When running rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Brightness/ --batch=100 --steps=250 --random_brightness=10:

screenshot from 2017-12-20 08-21-25

And the output:

2017-12-20 02:24:28.152544: Step 240: Train accuracy = 95.0%
2017-12-20 02:24:28.152608: Step 240: Cross entropy = 0.956711
2017-12-20 02:24:28.214349: Step 240: Validation accuracy = 91.0%
2017-12-20 02:30:48.570406: Step 249: Train accuracy = 95.0%
2017-12-20 02:30:48.570485: Step 249: Cross entropy = 0.980782
2017-12-20 02:30:48.632844: Step 249: Validation accuracy = 86.0%

Matthijs: Evaluation on the validation data: Final accuracy: 0.610859728507

MatthijsBurgh commented 6 years ago

@LoyVanBeek How about the evaluation on the validation data?

LoyVanBeek commented 6 years ago

Haven't had the time for that yet

LoyVanBeek commented 6 years ago

When running rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Flip/ --batch=100 --steps=1000 --flip_left_right=10:

screenshot from 2017-12-20 20-16-41

2017-12-20 20:06:20.026709: Step 990: Cross entropy = 0.301570
2017-12-20 20:06:20.089091: Step 990: Validation accuracy = 85.0%
2017-12-20 20:13:18.808711: Step 999: Train accuracy = 100.0%
2017-12-20 20:13:18.808775: Step 999: Cross entropy = 0.294399
2017-12-20 20:13:18.875108: Step 999: Validation accuracy = 88.0%

But again, the network is probably overfitted after 1000 steps, certainly with a training accuracy of 100% but a validation accuracy of 'only' 86%.

reinzor commented 6 years ago

Can you check on the separate validation set?

-Rein


Rayman commented 6 years ago

TRAINING accuracy will always be 100% after enough steps.

LoyVanBeek commented 6 years ago

@MatthijsBurgh said he would run on the validation set. And yes, the network is likely to be overfitting.

MatthijsBurgh commented 6 years ago

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Flip250/ --batch=100 --steps=250 --flip_left_right=10
Result: Final accuracy: 0.62443438914

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Crop250/ --random_crop=10 --batch=100 --steps=250
Result: Final accuracy: 0.619909502262

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Scale/ --scale=10 --batch=100 --steps=1000
Result: Final accuracy: 0.62443438914

rosrun tensorflow_ros retrain ~/MEGA/data/robotics_testlabs/training_data_Josja/training /tmp/inception ~/MEGA/data/robotics_testlabs/training_data_Josja/AugmentationTest/Scale250/ --batch=100 --steps=250 --random_scale=10
Result

LoyVanBeek commented 6 years ago

So it still sucks... :sob:

Rayman commented 6 years ago

Please don't set the batch size that high; somewhere between 10 and 32 is good, I think. I have mentioned this a few times already.

MatthijsBurgh commented 6 years ago

A lot of testing has been done, but no significant improvements have been shown.