abdikaiym01 opened this issue 2 years ago
There is something I've thought about:
The class labels used for training are actually not the same as the folder names, so counting sample_per_class from the raw folders would be a problem. You may try something like:
import data
import numpy as np
import pandas as pd

dataset_path = '/datasets/faces_casia_112x112_folders'
# pre_process_folder returns the actual training labels, not the raw folder names
image_names, image_classes, embeddings, classes, dest_pickle = data.pre_process_folder(dataset_path)
# Count samples per class label, ordered by label index
aa = pd.value_counts(image_classes).sort_index().values
# Save to npy
np.save('faces_casia_sample_per_class.npy', aa)
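As a quick sanity check (just my suggestion, not part of the repo), the counts should add up to the total number of images:

# Optional: every image should be counted exactly once
assert aa.sum() == len(image_classes)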
Then load it back in the loss function:
import numpy as np
import tensorflow as tf

sample_per_class = np.load('faces_casia_sample_per_class.npy')
sample_per_class = np.log(sample_per_class)
sample_per_class = tf.convert_to_tensor(sample_per_class.astype('float32'))
...
# Apply other steps as `balanced_softmax_loss`
logits += sample_per_class
...
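For reference, the whole adjustment could look roughly like this. It's just a sketch following the Balanced Meta-Softmax paper; the function name and signature here are mine, not the repo's, and it takes raw per-class counts rather than the pre-logged tensor above:

import tensorflow as tf

def balanced_softmax_loss(y_true, logits, sample_per_class):
    # Balanced Softmax: shift each class logit by log(n_c) before softmax,
    # so head classes need a larger raw logit to dominate the softmax
    logits = logits + tf.math.log(tf.cast(sample_per_class, logits.dtype))
    return tf.keras.losses.categorical_crossentropy(y_true, logits, from_logits=True)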
The logits for CosFace or ArcFace are in the value range [-1, 1], but even for a small dataset like CASIA, the log values of sample_per_class are in the range [0.6931472, 6.6871085], with most of them in [2.7, 3.7], which is too large...
aa = np.load('faces_casia_sample_per_class.npy')
aa = np.log(aa)
# How many classes share each log(count) value, most common first
pd.value_counts(aa).head(20)
# 2.708050 556
# 2.772589 514
# 2.833213 495
# 2.890372 444
# 2.944439 402
# 3.044522 352
# 2.995732 350
# 2.639057 315
# 3.091042 300
# 3.135494 290
# 3.178054 258
# 3.258097 251
# 3.218876 247
# 3.295837 236
# 3.401197 202
# 3.367296 198
# 3.332205 191
# 3.433987 164
# 2.564949 160
# 3.465736 157
Ok, I have some troubles with the convergence. What would you suggest to make it work well, since some datasets are very long-tailed?
I don't have much experience in handling long-tailed classification either.
Speaking of this Meta-Softmax, I think the original should work better with a plain softmax loss. With CosFace loss, it may be better to use * instead of + when combining logits with sample_per_class:
class CosFaceLoss(ArcfaceLossSimple):
    def __init__(self, margin=0.35, scale=64.0, from_logits=True, label_smoothing=0, sample_per_class=None, **kwargs):
        super(CosFaceLoss, self).__init__(margin, scale, from_logits, label_smoothing, **kwargs)
        self.sample_per_class = sample_per_class

    def call(self, y_true, norm_logits):
        if self.batch_labels_back_up is not None:
            self.batch_labels_back_up.assign(tf.argmax(y_true, axis=-1))
        # Subtract the CosFace margin from the target-class logits only
        pick_cond = tf.cast(y_true, dtype=tf.bool)
        logits = tf.where(pick_cond, norm_logits - self.margin, norm_logits)
        if self.sample_per_class is not None:
            logits *= self.sample_per_class  # * or +
        logits *= self.scale
        return tf.keras.losses.categorical_crossentropy(y_true, logits, from_logits=self.from_logits, label_smoothing=self.label_smoothing)
Anyway, I haven't actually tested this...
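If you try it, usage would be something like the following. This is hypothetical: it assumes an already-built Keras model named `model` and the .npy file saved above:

import numpy as np
import tensorflow as tf

# Load the precomputed per-class counts and take the log, as earlier
sample_per_class = np.load('faces_casia_sample_per_class.npy')
sample_per_class = tf.convert_to_tensor(np.log(sample_per_class).astype('float32'))

# Pass the tensor into the loss when compiling
model.compile(optimizer='adam', loss=CosFaceLoss(sample_per_class=sample_per_class))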
Hello @leondgarse, I've implemented Balanced Meta-Softmax (https://github.com/jiawei-ren/BalancedMetaSoftmax) with the CosFace loss function, but I have a problem with convergence. Test metrics like AgeDB fall down. Could you implement this in your great framework?