DetectionTeamUCAS / NAS_FPN_Tensorflow

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
https://arxiv.org/abs/1904.07392
MIT License
214 stars, 62 forks

Multi gpu training #2

Closed: unlabeledData closed this issue 3 years ago

unlabeledData commented 5 years ago

Hi, I ran into a problem when using the pretrained model you supplied, 'resnet50_v1d.ckpt'.

The error is like this:

NotFoundError (see above for traceback): Tensor name "resnet50_v1d/C2/bottleneck_0/conv0/BatchNorm/beta" not found in checkpoint files /home/DATA3/user-work/NAS_FPN_Tensorflow/data/pretrained_weights/resnet50_v1d.ckpt [[Node: save/RestoreV2_15 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_15/tensor_names, save/RestoreV2_15/shape_and_slices)]]

So, is the pretrained model correct? How can I fix it?
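A NotFoundError like the one above usually means the variable names the graph asks for do not match the names stored in the checkpoint. A minimal sketch for comparing the two follows. It assumes `tf.train.list_variables` is available in the installed TensorFlow; the helper names (`list_checkpoint_names`, `find_missing`, `top_level_scopes`) and every sample name except the one from the error message are hypothetical.

```python
def list_checkpoint_names(ckpt_path):
    """Return the variable names stored in a checkpoint (requires TensorFlow)."""
    # Imported lazily so the pure-Python helpers below work without TensorFlow.
    import tensorflow as tf
    return [name for name, _shape in tf.train.list_variables(ckpt_path)]


def find_missing(requested_names, checkpoint_names):
    """Return the requested names that are absent from the checkpoint, in order."""
    available = set(checkpoint_names)
    return [name for name in requested_names if name not in available]


def top_level_scopes(names):
    """Return the sorted set of first path components, e.g. 'resnet50_v1d'."""
    return sorted({name.split("/", 1)[0] for name in names})


# Illustration only: the first requested name is taken from the error message,
# all other names here are made up and do not describe the real checkpoint.
requested = [
    "resnet50_v1d/C2/bottleneck_0/conv0/BatchNorm/beta",
    "resnet50_v1d/C1/conv0/weights",
]
stored = ["resnet50_v1d/C1/conv0/weights", "some_other_scope/conv0/weights"]

print(find_missing(requested, stored))
print(top_level_scopes(stored))
```

With TensorFlow installed, `stored` could instead come from `list_checkpoint_names(...)` pointed at the downloaded checkpoint. If the top-level scopes or the BatchNorm variable names differ from what the graph requests, the checkpoint likely belongs to a different network definition than the one selected by NET_NAME.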

yangxue0827 commented 5 years ago

Can you show your cfgs.py? @unlabeledData

unlabeledData commented 5 years ago

OK, I changed some lines in cfgs.py.

GPU_GROUP = "1,2,3,4"
VERSION = 'FPN_Res50_COCO_20190503_v5'
NET_NAME = 'resnet50_v1d'
#NET_NAME = 'resnet_v1_50'
#NET_NAME = 'MobilenetV2'
ADD_BOX_IN_TENSORBOARD = True

#------------------------------------------ Train config
CUDA9 = False

# -------------------------------------------- Data_preprocess_config
# DATASET_NAME = 'coco'
# PIXEL_MEAN = [123.68, 116.779, 103.939]  # R, G, B. In tf, channel is RGB. In openCV, channel is BGR
# PIXEL_MEAN_ = [0.485, 0.456, 0.406]
# PIXEL_STD = [0.229, 0.224, 0.225]
# IMG_SHORT_SIDE_LEN = 800
# IMG_MAX_LENGTH = 1333
# CLASS_NUM = 80
DATASET_NAME = 'pascal'
PIXEL_MEAN = [123.68, 116.779, 103.939]  # R, G, B. In tf, channel is RGB. In openCV, channel is BGR
PIXEL_MEAN_ = [0.485, 0.456, 0.406]
PIXEL_STD = [0.229, 0.224, 0.225]
IMG_SHORT_SIDE_LEN = 800
IMG_MAX_LENGTH = 1333
CLASS_NUM = 20

Everything else is the same as your cfgs.py.

yangxue0827 commented 5 years ago

/home/DATA3/sunli-work/NAS_FPN_Tensorflow/data/pretrained_weights/resnet50_v1d.ckpt
Did you put the weights here? @unlabeledData

unlabeledData commented 5 years ago

Yes, I put resnet50_v1d.ckpt there.
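To double-check the placement, it can help to look at what actually sits on disk under that path. TensorFlow checkpoints are either a single file or a prefix with a `.index` file and one or more `.data-*` shards. The sketch below only inspects the filesystem; the function name `describe_checkpoint` is hypothetical, and which of the two layouts the supplied resnet50_v1d.ckpt uses is not stated in this thread.

```python
import glob
import os


def describe_checkpoint(prefix):
    """Report which on-disk files exist for a checkpoint path or prefix."""
    return {
        # Older single-file layout: the path itself is a regular file.
        "single_file": os.path.isfile(prefix),
        # Prefix layout: <prefix>.index plus <prefix>.data-XXXXX-of-YYYYY shards.
        "index_file": os.path.isfile(prefix + ".index"),
        "data_shards": sorted(glob.glob(glob.escape(prefix) + ".data-*")),
    }


# Example call; replace the path with the PRETRAINED_CKPT location you use.
print(describe_checkpoint("data/pretrained_weights/resnet50_v1d.ckpt"))
```

If every entry comes back empty or False, the restore path does not point at the weights. If the files are present, the problem is more likely a name mismatch inside the checkpoint than a missing file.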

GuHuiJian commented 3 years ago

Have you resolved this problem? @unlabeledData

unlabeledData commented 3 years ago

> Have you resolved this problem? @unlabeledData

Not yet. I ended up not using this project. Maybe you can try another ResNet-50 pretrained model.
The authors' FPN_Tensorflow worked well for me.

I'll close this issue once you've seen this comment.