tensorflow / similarity

TensorFlow Similarity is a Python package focused on making similarity learning quick and easy.
Apache License 2.0

Classification Vs. Retrieval #283

Closed BakhtawarRehman closed 2 years ago

BakhtawarRehman commented 2 years ago

I am trying to do classification using TensorFlow Similarity. The results are extremely good for retrieval; however, model.predict() is far off.

Below is the network used:

```python
x = tfsim.architectures.EfficientNetSim(
    (224, 224, 3),
    embedding_size=36,  # there are 36 classes
    trainable="partial",
    include_top=True,
    l2_norm=False,
    pooling="avg",
    gem_p=math.inf,
)
```

```python
loss = tfsim.losses.MultiSimilarityLoss(distance="euclidean")
model.compile(
    optimizer=tf.keras.optimizers.Adam(LR),
    loss=loss,
    metrics=[
        tfsim.training_metrics.avg_pos(distance=distance),
        tfsim.training_metrics.avg_neg(distance=distance),
    ],
)
```

The above network reaches a val_loss of 0.0388.

Also, I realized that the embedding size must equal the number of classes for classification (np.argmax(predictions[i])), so that the output has the same dimension as the class count. A higher or lower embedding size results in a differently shaped output.
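
For reference, a minimal sketch of the shape behavior described above (x_test is a placeholder for a batch of input images):

```python
import numpy as np

# model.predict() returns embeddings, one row per example:
predictions = model.predict(x_test)  # shape: (n_samples, embedding_size)

# argmax over an embedding only indexes the range 0..n_classes-1
# when embedding_size happens to equal the number of classes.
predicted_class = np.argmax(predictions[0])
```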

I am trying to run the model on an edge device. However, for retrieval, indexing is necessary; otherwise model.lookup(x) just returns an empty array. Is there a way to use this without indexing or calibration? I have seen another issue where it is mentioned that indexing is not necessary and just makes things more efficient; however, model.lookup() returns an empty [[]] without indexing.
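
For context, a minimal sketch of the indexing step in question (x_index, y_index, and x_query are placeholder arrays):

```python
# Populate the index from a reference set; without this step,
# lookup() has nothing to search and returns [[]].
model.reset_index()
model.index(x_index, y=y_index, data=x_index)

# Query the k nearest indexed neighbors for each example.
neighbors = model.lookup(x_query, k=5)
```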

Many thanks for this great resource, TensorFlow Similarity.

owenvallis commented 2 years ago

Hi @BakhtawarRehman. I think you may be conflating the embedding output and softmax multi-class outputs. The embedding output dimension describes the space in which we place the embedded examples and does not need to equal the number of classes. For example, you can have 1000 classes and use a 2-dimensional output embedding. This can be useful if you want to visualize the embedded points; however, the limited number of dimensions may not allow the embedding to effectively separate the examples. Adding more dims makes it easier to separate the classes, but again, think of the examples as vectors embedded in your space rather than each dimension encoding a single class.
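
To make that concrete, here is a minimal sketch (not from the thread; x_train, y_train, and x_test are placeholders) of a small 2-dimensional embedding trained over many classes, where classification is done via nearest-neighbor lookup rather than argmax:

```python
import tensorflow_similarity as tfsim

# Embedding size is independent of the class count:
# 2 output dims can embed examples from any number of classes.
model = tfsim.architectures.EfficientNetSim((224, 224, 3), embedding_size=2)
model.compile(optimizer="adam", loss=tfsim.losses.MultiSimilarityLoss())
model.fit(x_train, y_train, epochs=10)

# Classify by taking the label of the nearest indexed neighbor,
# not by argmax over the embedding dimensions.
model.index(x_train, y=y_train)
lookups = model.lookup(x_test, k=1)
predicted = [hits[0].label for hits in lookups]
```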

Regarding your second question, you can load the model as a standard Keras model. This removes the need for the nmslib deps and also removes all the extra methods you get with the Similarity model. You can then call predict on the inputs and handle the neighbor lookup using some other approximate nearest neighbor tool. This could be something as simple as storing an index of example embeddings on the device and then computing 1 - inner_product to get the cosine distance (assuming you trained with cosine distance).
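
A minimal sketch of that on-device flow (paths and array names are placeholders; assumes the model was trained with cosine distance, so embeddings are unit-normalized):

```python
import numpy as np
import tensorflow as tf

# Load as a plain Keras model: no nmslib dependency, no index methods.
model = tf.keras.models.load_model("path/to/saved_model", compile=False)

# index_embeddings / index_labels: embeddings and labels of the
# reference examples, precomputed and stored on the device.
query_embeddings = model.predict(x_query)

# For unit-normalized vectors, cosine distance = 1 - inner product.
distances = 1.0 - query_embeddings @ index_embeddings.T
nearest = np.argmin(distances, axis=1)
predicted_labels = index_labels[nearest]
```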