peteryuX / arcface-tf2

ArcFace unofficial Implemented in Tensorflow 2.0+ (ResNet50, MobileNetV2). "ArcFace: Additive Angular Margin Loss for Deep Face Recognition" Published in CVPR 2019. With Colab.
MIT License
260 stars 60 forks

How can I still get loss=nan #31

Closed MBoaretto25 closed 3 years ago

MBoaretto25 commented 3 years ago

I'm using the MS-Celeb-1M dataset, downloaded from the link posted in README.md

1. I converted the data to tfrecords following the steps provided in the documentation for binary images.

2. My training config looks like this:

```
# general
batch_size: 8
input_size: 112
embd_shape: 128
sub_name: 'arc_mbv2'
backbone_type: 'MobileNetV2' # 'ResNet50', 'MobileNetV2'
head_type: ArcHead # 'ArcHead', 'NormHead'
is_ccrop: False # central-cropping or not

# train
train_dataset: './data/imgs_full.tfrecord'
binary_img: True
num_classes: 85742
num_samples: 5822653
epochs: 100
base_lr: 0.01
w_decay: !!float 5e-4
save_steps: 100

# test
test_dataset: '.test/'
```
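For scale, here is a minimal sketch of what the settings above imply for the length of a run. It assumes one optimizer step per batch, that a trailing partial batch is dropped, and that `save_steps` counts steps between checkpoints. None of that is confirmed against the training script, and `training_schedule` is a hypothetical helper, not part of the repo.

```python
# Values copied from the config above; the formulas below are assumptions.
cfg = {
    "batch_size": 8,
    "num_samples": 5822653,
    "epochs": 100,
    "save_steps": 100,
}


def training_schedule(cfg):
    """Derive step counts from the config (hypothetical helper).

    Assumes one step per batch and that a trailing partial batch is dropped.
    """
    steps_per_epoch = cfg["num_samples"] // cfg["batch_size"]
    total_steps = steps_per_epoch * cfg["epochs"]
    # Assumes save_steps is the number of steps between checkpoints.
    saves_per_epoch = steps_per_epoch // cfg["save_steps"]
    return steps_per_epoch, total_steps, saves_per_epoch


steps_per_epoch, total_steps, saves_per_epoch = training_schedule(cfg)
print(steps_per_epoch, total_steps, saves_per_epoch)
# prints: 727831 72783100 7278
```

Under these assumptions a single epoch at `batch_size: 8` is roughly 728k steps, so the NaN shows up a long way before the first epoch finishes.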

But I'm still getting loss=nan. Is it normal for the initial epochs? Is it a tfrecords error?