tslgithub / image_class

A Keras-based collection of image classification models: VGG16, VGG19, InceptionV3, Xception, MobileNet, AlexNet, LeNet, ZF_Net, ResNet18, ResNet34, ResNet50, ResNet_101, ResNet_152, DenseNet
433 stars 145 forks

Checkpoint problem #12

Closed sherjy closed 4 years ago

sherjy commented 4 years ago

Hello, and first of all thank you for this repo, which has helped my learning a lot. However, I ran into a checkpoint problem while running it. The following errors occurred during training with the repo's original train.py:

    ……
    ls: cannot access 'checkpoints/ResNet152/events': No such file or directory
    512
    rm: cannot remove 'checkpoints/ResNet152/events': No such file or directory
    ……

After that error, testing with predict.py failed with:

    OSError: Unable to open file (unable to open file: name = './checkpoints/VGG16/VGG16.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

sherjy commented 4 years ago

def train(self, X_train, X_test, y_train, y_test, model):
    tensorboard = TensorBoard(log_dir=self.mkdir(os.path.join(self.checkpoints, self.model_name)))

    lr_reduce = keras.callbacks.ReduceLROnPlateau(monitor=config.monitor,
                                                  factor=0.1,
                                                  patience=config.lr_reduce_patience,
                                                  verbose=1,
                                                  mode='auto',
                                                  cooldown=0)
    early_stop = keras.callbacks.EarlyStopping(monitor=config.monitor,
                                               min_delta=0,
                                               patience=config.early_stop_patience,
                                               verbose=1,
                                               mode='auto')
    checkpoint = keras.callbacks.ModelCheckpoint(os.path.join(self.mkdir( os.path.join(self.checkpoints,self.model_name) ),self.model_name+'.h5'),
                                                 monitor=config.monitor,
                                                 verbose=1,
                                                 save_best_only=True,
                                                 save_weights_only=True,
                                                 mode='auto',
                                                 period=1)

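…… It looks like the checkpoint needs to be loaded before training, so I assume the error occurs because no checkpoint exists. Are some files missing from the repo?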

tslgithub commented 4 years ago

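The error "rm: cannot remove 'checkpoints/ResNet152/events*': No such file or directory" is harmless. It only means you have not trained ResNet152 yet; once you train ResNet152, the events files will be generated. This code requires you to train first and then run predict.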
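The train-then-predict ordering can be made explicit with two small guards. This is a minimal sketch using only the standard library, not the repo's actual code: the helper names `clear_event_files` and `require_weights` are hypothetical, and the weights layout `<checkpoints>/<model_name>/<model_name>.h5` is assumed from the `ModelCheckpoint` path and the `OSError` message quoted above.

```python
import glob
import os


def clear_event_files(log_dir):
    """Delete files named events* in log_dir and return how many were removed."""
    removed = 0
    # glob.glob returns an empty list when nothing matches (or when log_dir
    # does not exist), so a first run produces no "No such file" error.
    for path in glob.glob(os.path.join(log_dir, "events*")):
        if os.path.isfile(path):
            os.remove(path)
            removed += 1
    return removed


def require_weights(checkpoints_dir, model_name):
    """Return the expected weights path, or raise if the model was never trained."""
    # Assumed layout: <checkpoints_dir>/<model_name>/<model_name>.h5
    weights_path = os.path.join(checkpoints_dir, model_name, model_name + ".h5")
    if not os.path.isfile(weights_path):
        raise FileNotFoundError(
            f"{weights_path} not found: train {model_name} before running predict"
        )
    return weights_path
```

Calling `require_weights("./checkpoints", "VGG16")` before loading weights would replace the low-level `OSError` with a message that says which model still needs training.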

sherjy commented 4 years ago

Thanks, the code version updated yesterday afternoon has already fixed the problem. Thank you~