potterhsu / SVHNClassifier

A TensorFlow implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks (http://arxiv.org/pdf/1312.6082.pdf)
GNU General Public License v3.0

How to access/interpret the confidence values? #8

Closed markusjudge closed 6 years ago

markusjudge commented 6 years ago

Hello potterhsu,

the digits_predictions are accessed via: digits_predictions = tf.argmax(digits_logits, axis=2)

But how can one access the confidence values themselves, so that confidence thresholding can be applied as described in the paper (Goodfellow et al.)?

In my case, digits_logits is, for example, a tensor of shape (1, 5, 11). If I understand correctly, axis 1 corresponds to the digit position and axis 2 holds 11 (confidence?) values per position (classes 0-9, plus 10 for "no digit"), so tf.argmax picks the class with the highest value at each position. But when I looked at the values, they ranged from -16.5 to 11.5. If those are indeed the confidence values, how can one normalize them to the range 0 to 1? Can you help me out?

markusjudge commented 6 years ago

Ok, I think I got this. Just replace digit1 = dense with digit1 = tf.nn.softmax(dense) for each digit!

Then digits_logits holds softmax probabilities instead of raw logits, and you can access the confidence values with: confidence = tf.reduce_max(digits_logits, axis=2)
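For anyone wondering why softmax answers the normalization question, here is a minimal sketch in plain Python for a single digit position. The logit values are made up for illustration (only the -16.5 and 11.5 extremes come from the question), and `softmax` here is a hand-written helper, not the TensorFlow op.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)                      # subtract the max to avoid overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for one digit position: classes 0-9 plus 10 ("no digit").
position_logits = [-16.5, -3.2, 0.4, 11.5, 2.0, -7.1, -1.0, 5.3, -9.8, 0.0, -4.4]

probs = softmax(position_logits)            # 11 values in (0, 1] that sum to 1
prediction = probs.index(max(probs))        # same index as argmax over the logits
confidence = max(probs)                     # probability of the predicted class
```

Softmax is monotonic, so the predicted class does not change; only the scale of the scores does.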

seovchinnikov commented 6 years ago

To clarify, in your inference notebook you can do something like this:

length_logits, digits_logits = Model.inference(images, drop_rate=0.0)
length_predictions = tf.argmax(length_logits, axis=1)
digits_probs = tf.nn.softmax(digits_logits)
digits_max_probs = tf.reduce_max(digits_probs, axis=2)
digits_predictions = tf.argmax(digits_logits, axis=2)
digits_predictions_string = tf.reduce_join(tf.as_string(digits_predictions), axis=1)

and then:

length_predictions_val, digits_predictions_string_val, digits_max_preds_pr, images_val = sess.run([length_predictions, digits_predictions_string, digits_max_probs, images])
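Once those values are fetched, thresholding could look roughly like the sketch below. This is one possible approach, not something from the repository: `sequence_confidence`, `accept`, the `length_prob` parameter, and the 0.9 threshold are all hypothetical. It multiplies the per-digit maximum probabilities (one row of digits_max_preds_pr) over the predicted length. As far as I read the paper, its sequence probability also includes a length term; the snippet above does not compute length probabilities, so that term defaults to 1.0 here.

```python
import math

def sequence_confidence(digit_max_probs, length, length_prob=1.0):
    """Product of the per-digit max probabilities over the first `length`
    positions, optionally scaled by a length probability (assumed 1.0)."""
    return length_prob * math.prod(digit_max_probs[:length])

def accept(digit_max_probs, length, threshold=0.9, length_prob=1.0):
    """Keep a prediction only if its sequence confidence reaches the threshold."""
    return sequence_confidence(digit_max_probs, length, length_prob) >= threshold

# Made-up values standing in for one row of digits_max_preds_pr.
row = [0.99, 0.98, 0.97, 0.60, 0.55]

keep_short = accept(row, length=3)   # only the first three positions count
keep_full = accept(row, length=5)    # the two low-confidence digits are included
```

Positions beyond the predicted length are ignored, so padding digits do not drag the score down.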