deepinsight / insightface

State-of-the-art 2D and 3D Face Analysis Project
https://insightface.ai

LocalFileSystem: Check failed: allow_null: :Open "./models/m1-softmax-emore,1-symbol.json": No such file or directory #1410

Open cserezwan opened 3 years ago

cserezwan commented 3 years ago

Hello, I'm facing the following issue while trying to fine-tune the softmax model with triplet loss.

Command:

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network m1 --loss triplet --lr 0.005 --pretrained ./models/m1-softmax-emore,1

Error:

gpu num: 4
prefix ./models/m1-triplet-emore/model
image_size [112, 112]
num_classes 85742
Called with argument: Namespace(batch_size=240, ckpt=3, ctx_num=4, dataset='emore', frequent=20, image_channel=3, kvstore='device', loss='triplet', lr=0.005, lr_steps='100000,160000,220000', models_root='./models', mom=0.9, network='m1', per_batch_size=60, pretrained='./models/m1-softmax-emore,1', pretrained_epoch=1, rescale_threshold=0, verbose=2000, wd=0.0005) {'bn_mom': 0.9, 'workspace': 256, 'emb_size': 256, 'ckpt_embedding': True, 'net_se': 0, 'net_act': 'prelu', 'net_unit': 3, 'net_input': 1, 'net_blocks': [1, 4, 6, 2], 'net_output': 'GDC', 'net_multiplier': 1.0, 'val_targets': ['lfw', 'cfp_fp', 'agedb_30'], 'ce_loss': True, 'fc7_lr_mult': 1.0, 'fc7_wd_mult': 1.0, 'fc7_no_bias': False, 'max_steps': 0, 'data_rand_mirror': True, 'data_cutoff': False, 'data_color': 0, 'data_images_filter': 0, 'count_flops': True, 'memonger': False, 'loss_name': 'triplet', 'images_per_identity': 5, 'triplet_alpha': 0.3, 'triplet_bag_size': 7200, 'triplet_max_ap': 0.0, 'per_batch_size': 60, 'lr': 0.05, 'net_name': 'fmobilenet', 'dataset': 'emore', 'dataset_path': '../datasets/faces_emore', 'num_classes': 85742, 'image_shape': [112, 112, 3], 'loss': 'triplet', 'network': 'm1', 'num_workers': 1, 'batch_size': 240}
loading ./models/m1-softmax-emore,1 1
Traceback (most recent call last):
  File "train.py", line 486, in <module>
    main()
  File "train.py", line 482, in main
    train_net(args)
  File "train.py", line 285, in train_net
    _, arg_params, aux_params = mx.model.load_checkpoint(
  File "/home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/model.py", line 476, in load_checkpoint
    symbol = sym.load('%s-symbol.json' % prefix)
  File "/home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/symbol/symbol.py", line 2948, in load
    check_call(_LIB.MXSymbolCreateFromFile(c_str(fname), ctypes.byref(handle)))
  File "/home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/base.py", line 246, in check_call
    raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
  [bt] (2) /home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/libmxnet.so(MXSymbolCreateFromFile+0x61) [0x7f9b01978b51]
  [bt] (1) /home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/libmxnet.so(+0x91c9a1a) [0x7f9b02d0ea1a]
  [bt] (0) /home/antik/Desktop/Projects/Proctoring-AI-master/face_detection/venv/lib/python3.8/site-packages/mxnet/libmxnet.so(+0x91d1f21) [0x7f9b02d16f21]
  File "src/io/local_filesys.cc", line 209
LocalFileSystem: Check failed: allow_null: :Open "./models/m1-softmax-emore,1-symbol.json": No such file or directory


Please guide me in solving this issue. Thanks.

deepage commented 3 years ago

You should use a command like this, passing the checkpoint prefix and the epoch as separate arguments:

CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train.py --network m1 --loss triplet --lr 0.005 --pretrained ./models/m1-softmax-emore --pretrained_epoch=1
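
For context, here is a minimal sketch of why the comma form fails. According to the traceback above, `load_checkpoint` builds the symbol filename as `'%s-symbol.json' % prefix`, so anything passed as the prefix, including a trailing `,1`, ends up in the filename. The helper name `expected_checkpoint_files` is hypothetical, and the `-%04d.params` pattern for the weights file is the usual MXNet checkpoint naming, assumed here rather than shown in the log.

```python
def expected_checkpoint_files(prefix, epoch):
    """Return the (symbol, params) paths a checkpoint load would look for.

    The symbol pattern is taken from the traceback in this thread; the
    params pattern is an assumption based on MXNet's usual naming.
    """
    symbol_path = '%s-symbol.json' % prefix
    params_path = '%s-%04d.params' % (prefix, epoch)
    return symbol_path, params_path


# Prefix as passed in the original command: the ",1" stays in the filename.
print(expected_checkpoint_files('./models/m1-softmax-emore,1', 1)[0])

# Prefix and epoch passed separately, as suggested above.
print(expected_checkpoint_files('./models/m1-softmax-emore', 1)[0])
```

The first call yields the exact path reported in the error message, which is why the file cannot be found.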

MichaelCurrie commented 2 years ago

I got this error when I passed an invalid model parameters filepath to the .load_parameters method.

In your case that also seems likely, since the filename "./models/m1-softmax-emore,1-symbol.json" contains a comma, which is not a commonly used character in filenames on POSIX or Windows file systems (although it is valid).
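
One way to catch this kind of mistake before MXNet raises is a small pre-flight check on the paths. The sketch below is only an illustration: `split_prefix_epoch` and `missing_checkpoint_files` are hypothetical helpers, not part of train.py or MXNet, and the `-%04d.params` filename pattern is an assumption about the usual checkpoint naming.

```python
import os


def split_prefix_epoch(value, default_epoch=0):
    """Split a 'prefix,epoch' string; return (value, default_epoch) otherwise."""
    head, sep, tail = value.rpartition(',')
    if sep and tail.isdigit():
        return head, int(tail)
    return value, default_epoch


def missing_checkpoint_files(prefix, epoch):
    """List the expected checkpoint files that do not exist on disk."""
    paths = ['%s-symbol.json' % prefix, '%s-%04d.params' % (prefix, epoch)]
    return [p for p in paths if not os.path.exists(p)]


prefix, epoch = split_prefix_epoch('./models/m1-softmax-emore,1')
missing = missing_checkpoint_files(prefix, epoch)
if missing:
    print('Missing checkpoint files:', missing)
```

Running a check like this before loading gives a readable list of the missing files instead of a backtrace from libmxnet.so.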

Also: to admins: this issue should probably be closed until the poster provides further feedback or replication steps (it's been over a year).