610265158 / face_landmark

A simple method for face alignment based on Wing loss and multitask learning :)
Apache License 2.0
251 stars, 80 forks

Training error #32

Open yanduosha opened 4 years ago

yanduosha commented 4 years ago

Training fails with the error below. Could you help take a look?

shajunqin@tonly-Super-Server:~/face/face_landmark-master$ python3 train.py
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
[2020-03-21 13:57:05,638] [INFO] The trainer start
2020-03-21 13:57:05.639374: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-03-21 13:57:05.643071: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.643261: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.644634: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.645885: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.646210: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.647953: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.649239: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.653271: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.654423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.654785: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-21 13:57:05.822849: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x49d5dc0 executing computations on platform CUDA. Devices:
2020-03-21 13:57:05.822906: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-03-21 13:57:05.846545: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299855000 Hz
2020-03-21 13:57:05.852721: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4b799f0 executing computations on platform Host. Devices:
2020-03-21 13:57:05.852772: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2020-03-21 13:57:05.853937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.854029: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.854067: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.854099: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.854130: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.854163: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.854199: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.854259: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.856208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.856290: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.857873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-21 13:57:05.857909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-03-21 13:57:05.857927: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-03-21 13:57:05.865393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5953 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
1 Physical GPUs, 1 Logical GPUs
2020-03-21 13:57:05.875745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.875858: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.875911: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.875954: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.875995: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.876037: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.876078: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.876121: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.881381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.881438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-21 13:57:05.881461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-03-21 13:57:05.881477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-03-21 13:57:05.883184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 5953 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
[2020-03-21 13:57:07,515] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,516] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,517] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,525] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,526] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,526] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    main()
  File "train.py", line 51, in main
    model(image)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 416, in __call__
    self._initialize(args, kwds, add_initializers_to=initializer_map)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 359, in _initialize
    *args, **kwds))
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1360, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1648, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1541, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 716, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 309, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2155, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 706, in wrapper
    raise e.ag_error_metadata.to_exception(type(e))
RuntimeError: in converted code:
    relative to /home/shajunqin:

face/face_landmark-master/lib/core/model/shufflenet/simpleface.py:51 call  *
    x1, x2, x3 = self.backbone(inputs, training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
face/face_landmark-master/lib/core/model/shufflenet/shufflenet.py:197 call  *
    x=self.first_conv(inputs,training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py:262 call
    outputs = layer(inputs, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/normalization.py:651 call
    outputs = self._fused_batch_norm(inputs, training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/normalization.py:533 _fused_batch_norm
    self.add_update(mean_update)
.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:507 new_func
    return func(*args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1083 add_update
    '`add_update` was called in a cross-replica context. This is not '

RuntimeError: `add_update` was called in a cross-replica context. This is not expected. If you require this feature, please file an issue.
610265158 commented 4 years ago

Which branch are you using? Try running it on a single GPU.
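
One way to try the single-GPU suggestion is to hide all but one device from CUDA before TensorFlow is imported. The sketch below is only an illustration: `restrict_to_single_gpu` is a hypothetical helper, not something from this repository, and it relies on the standard `CUDA_VISIBLE_DEVICES` environment variable. The log above already reports "1 Physical GPUs, 1 Logical GPUs", so this may only rule out multi-GPU placement rather than resolve the `add_update` error.

```python
import os


def restrict_to_single_gpu(gpu_index=0, environ=os.environ):
    """Expose only one GPU to CUDA-based libraries.

    Hypothetical helper (not part of face_landmark). It must run before
    `import tensorflow`, because the CUDA runtime reads the variable
    when it initializes.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    # CUDA_VISIBLE_DEVICES takes a comma-separated list of device indices;
    # a single index leaves exactly one device visible.
    environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return environ["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    visible = restrict_to_single_gpu(0)
    print("CUDA_VISIBLE_DEVICES =", visible)
    # Import TensorFlow and start training only after this point.
```

The same effect can be had from the shell by prefixing the command, for example `CUDA_VISIBLE_DEVICES=0 python3 train.py`.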