getnamo / TensorFlow-Unreal-Examples

Drag and drop Unreal Engine TensorFlow examples repository.
MIT License

Predictions are not as accurate as they could be #11

Open Cobryis opened 6 years ago

Cobryis commented 6 years ago

Hey, not sure if I'm doing something wrong here or everyone gets this. With mnistSimple, every prediction after 4 is wrong just using the keyboard: 5 -> 3; 6 -> 5; 7 -> 3; 8 -> 3; 9 -> 3. 0 is correct. Drawing also gives incorrect results. The sample data downloads successfully and training completes, and the received image comes back correct. The log shows Test loss: 0.04954743304257281 and Test accuracy: 0.9837. If I switch to kerasCNN it improves but is still broken: I was getting 7 for pressing 9 on the keyboard, and now I get 3. When drawing the numbers, 8 gave me 3; 9 was giving me 7, but now it seems to be working.

I am on tf-ue4 0.6.0 with CUDA 9.0.176.1 and cuDNN v7. I should note that I also had this issue on tf-ue4 4 or 5 with CUDA 8 and cuDNN v6. I figured I'd try the examples again when you released an update for 4.18.

getnamo commented 6 years ago

I think this is largely down to how the UE4 textures are being generated for inference.

In simple terms, the keyboard test images are just images that were drawn in Paint, saved as PNGs, and imported into the editor as assets; they do not come from the MNIST test set and may lack the centering, stroke thickness, and black-level range of the training data. Similarly, when you draw a figure, the strokes are sent to the UE4 editor, which paints them onto a 28x28 image in pure black at a roughly similar thickness. While this produces similar-looking digits, they are not exactly like the MNIST test data; in particular they have no grey-scale anti-aliasing, which the networks may have relied on during training to help with prediction.
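To illustrate the mismatch, here is a rough sketch (plain Python/NumPy, not the plugin's actual pipeline; the function name and thresholds are my own) of the kind of preprocessing that would make a pure-black drawn digit look more MNIST-like: soften the hard binary stroke edges into grey-scale and recenter the digit by its center of mass.

```python
import numpy as np

def mnist_like(img):
    """Make a binary 28x28 drawing resemble an MNIST sample:
    soften stroke edges and recenter by center of mass.
    Illustrative sketch only, not the plugin's real pipeline."""
    img = img.astype(np.float32)
    # Soften hard binary edges with a simple 3x3 box blur,
    # approximating MNIST's anti-aliased grey-scale strokes.
    padded = np.pad(img, 1)
    blurred = sum(
        padded[dy:dy + 28, dx:dx + 28]
        for dy in range(3) for dx in range(3)
    ) / 9.0
    # Recenter by center of mass (MNIST digits are centered this way).
    ys, xs = np.nonzero(blurred > 0.05)
    if len(ys) == 0:
        return blurred
    shift_y = int(round(13.5 - ys.mean()))
    shift_x = int(round(13.5 - xs.mean()))
    return np.roll(np.roll(blurred, shift_y, axis=0), shift_x, axis=1)
```

Running the drawn image through something like this before inference would narrow the gap between what the network trained on and what it actually sees.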

What this means is that the data we train the network on is not functionally the same as the data we send for inference. A simple linear regression model cannot bridge that gap, so it performs significantly worse than its training accuracy would indicate (~40% actual vs ~90% reported). A Keras CNN, on the other hand, generalizes better and can infer correctly even when the input is not exactly like the data it trained on, so the gap shrinks (~90% vs ~98%).

To get results matching the reported accuracies we would need to do one of the following:

- train against the exact same kind of data we use for inference,
- train against a much wider variance of inputs, forcing the network to become robust to noise and input variation, or
- add stroke pressure so that we recreate MNIST-like samples more accurately in UE4.
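The wider-variance idea can be sketched as simple on-the-fly augmentation of the MNIST training images. This is a hypothetical example in plain NumPy; the jitter ranges are my own guesses, not tuned values from the plugin:

```python
import numpy as np

def augment(img, rng):
    """Randomly jitter a 28x28 training image so the network sees a
    wider variance of inputs: small translations plus occasional
    stroke thickening (mimicking different pen widths)."""
    out = img.copy()
    # Shift up to 2 pixels in each axis.
    dy, dx = rng.integers(-2, 3, size=2)
    out = np.roll(np.roll(out, dy, axis=0), dx, axis=1)
    # Half the time, thicken strokes with a 3x3 max filter.
    if rng.random() < 0.5:
        padded = np.pad(out, 1)
        out = np.max(
            [padded[y:y + 28, x:x + 28] for y in range(3) for x in range(3)],
            axis=0,
        )
    return out
```

Feeding each batch through `augment` instead of using the raw images means every epoch sees slightly different digits, which pushes the network toward the robustness a CNN shows here naturally.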

Contributions are certainly welcome; consider adding examples that handle these discrepancies, or tests against real MNIST samples :)

Cobryis commented 6 years ago

Thanks for responding. This is my first foray into TensorFlow, and I just wanted to make sure I wasn't missing some configuration, or that there wasn't something terribly wrong on my end, before diving deeper into the plugin. I'll definitely look at contributing once I get my bearings. Thanks for all your hard work!