610265158 / face_landmark

A simple method for face alignment based on wingloss and multitask learning :)
Apache License 2.0

Help needed: how do I load a previously trained model and continue training? #20

Open WadonLiu opened 4 years ago

WadonLiu commented 4 years ago

As the title says. I commented out the following code:

    with strategy.scope():
        if 'ShuffleNet' in cfg.MODEL.net_structure:
            model = SimpleFace_shufflenet()
        else:
            model = SimpleFace_mobilenet()
        # Run one dummy forward pass so the model's variables get built
        image = np.zeros(shape=(1, 160, 160, 3), dtype=np.float32)
        model(image)

    # Restore weights from a previous run, if a path is configured
    if cfg.MODEL.pretrained_model is not None:
        model.load_weights(cfg.MODEL.pretrained_model)

and then changed it to:

    model = tf.saved_model.load('./model/epoch_1_val_loss163.284302')

After that it raised the error below. Could some kind soul help me out? How do I fix this?

[2020-01-14 10:04:33,406] [INFO] Error reported to Coordinator: '_UserObject' object is not callable
Traceback (most recent call last):
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 879, in run
    self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 258, in wrapper
    return func(*args, **kwargs)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 106, in train_step
    predictions = self.model(image, training=True)
TypeError: '_UserObject' object is not callable
Traceback (most recent call last):
  File "train.py", line 112, in <module>
    main()
  File "train.py", line 109, in main
    strategy)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 196, in custom_loop
    train_dist_dataset,epoch)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 151, in distributed_train_epoch
    self.train_step, args=(one_batch,))
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 760, in experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1787, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 661, in _call_for_each_replica
    fn, args, kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 196, in _call_for_each_replica
    coord.join(threads)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/six.py", line 696, in reraise
    raise value
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 879, in run
    self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 258, in wrapper
    return func(*args, **kwargs)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 106, in train_step
    predictions = self.model(image, training=True)
TypeError: '_UserObject' object is not callable

WadonLiu commented 4 years ago

I changed it to this and ran into a new error:

    loaded = tf.saved_model.load('./model/epoch_1_val_loss163.284302')
    model = loaded.signatures["serving_default"]

Traceback (most recent call last):
  File "train.py", line 116, in <module>
    main()
  File "train.py", line 113, in main
    strategy)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 196, in custom_loop
    train_dist_dataset,epoch)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 151, in distributed_train_epoch
    self.train_step, args=(one_batch,))
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 760, in experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1787, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 661, in _call_for_each_replica
    fn, args, kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 196, in _call_for_each_replica
    coord.join(threads)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/six.py", line 696, in reraise
    raise value
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/distribute/mirrored_strategy.py", line 879, in run
    self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 258, in wrapper
    return func(*args, **kwargs)
  File "/disk1/face_landmark-master/lib/core/base_trainer/net_work.py", line 106, in train_step
    predictions = self.model(image, training=True)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "/root/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1120, in _call_impl
    list(kwargs.keys()), list(self._arg_keywords)))
TypeError: Keyword arguments ['training'] unknown. Expected ['images'].

WadonLiu commented 4 years ago

In the end I switched the saving method to this:

    self.model.save_weights(filepath="./model/" + current_model_saved_name)

and load it with this:

    model.load_weights(cfg.MODEL.pretrained_model)

I don't know whether there is a better way.
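
One possible alternative to the `save_weights` / `load_weights` pair above is `tf.train.Checkpoint`, which can store extra training state (a step counter, or an optimizer passed as another keyword argument) next to the model variables. The following is only a minimal sketch under assumptions: `TinyNet`, `make_checkpoint` and the temporary directory are hypothetical stand-ins and not part of this repo. In the real trainer the model would presumably be rebuilt inside `strategy.scope()` before restoring.

```python
import tempfile

import tensorflow as tf


class TinyNet(tf.Module):
    """Hypothetical stand-in for SimpleFace_mobilenet / SimpleFace_shufflenet."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([3, 2]), name="w")
        self.b = tf.Variable(tf.zeros([2]), name="b")

    def __call__(self, x):
        return tf.matmul(x, self.w) + self.b


def make_checkpoint(model, step):
    # Each keyword argument becomes part of the saved state. An optimizer
    # could be tracked the same way by passing it as a further keyword.
    return tf.train.Checkpoint(model=model, step=step)


ckpt_dir = tempfile.mkdtemp()  # placeholder for something like "./model/"

# First run: take one manual gradient step, then save.
model = TinyNet()
step = tf.Variable(0, dtype=tf.int64)
x = tf.ones([4, 3])
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x)))
grads = tape.gradient(loss, [model.w, model.b])
model.w.assign_sub(0.1 * grads[0])
model.b.assign_sub(0.1 * grads[1])
step.assign_add(1)

manager = tf.train.CheckpointManager(
    make_checkpoint(model, step), ckpt_dir, max_to_keep=3)
save_path = manager.save()

# Second run: rebuild the same objects first, then restore into them.
resumed_model = TinyNet()
resumed_step = tf.Variable(0, dtype=tf.int64)
resumed_ckpt = make_checkpoint(resumed_model, resumed_step)
resumed_ckpt.restore(tf.train.latest_checkpoint(ckpt_dir))
```

After the restore, `resumed_model` holds the trained values and `resumed_step` can be used to pick the epoch or step to continue from. As with `load_weights`, this only works if the resumed objects are created with the same structure as the ones that were saved.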