thtrieu / darkflow

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices
GNU General Public License v3.0

Need Help!!! I can't use the GPU for training this image #1135

Open ArieSiregar opened 4 years ago

ArieSiregar commented 4 years ago

(TA) D:\Kuliah\Tutorial YOLO\darkflow-master>python flow --model cfg\yolololo.cfg --load bin\yolo.weights --train --dataset test\dataset\images --annotation test\dataset\annotations --gpu 0.8

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:15: The name tf.train.RMSPropOptimizer is deprecated. Please use tf.compat.v1.train.RMSPropOptimizer instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:16: The name tf.train.AdadeltaOptimizer is deprecated. Please use tf.compat.v1.train.AdadeltaOptimizer instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:17: The name tf.train.AdagradOptimizer is deprecated. Please use tf.compat.v1.train.AdagradOptimizer instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:18: The name tf.train.AdagradDAOptimizer is deprecated. Please use tf.compat.v1.train.AdagradDAOptimizer instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:19: The name tf.train.MomentumOptimizer is deprecated. Please use tf.compat.v1.train.MomentumOptimizer instead.

Parsing ./cfg/yolo.cfg
Parsing cfg\yolololo.cfg
Loading bin\yolo.weights ...
Successfully identified 203934260 bytes
Finished in 0.10496234893798828s

Building net ...

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:105: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py:70: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py:71: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py:84: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

       |        | input                            | (?, 256, 256, 3)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 256, 256, 32)

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\simple.py:106: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 128, 128, 32)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 128, 128, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 64, 64, 64)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 64, 64, 128)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 64, 64, 64)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 64, 64, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 32, 32, 128)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 32, 32, 256)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 32, 32, 128)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 32, 32, 256)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 16, 16, 256)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 16, 16, 512)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 16, 16, 256)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 16, 16, 512)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 16, 16, 256)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 16, 16, 512)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 8, 8, 512)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 8, 8, 512)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 8, 8, 512)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Load  |  Yep!  | concat [16]                      | (?, 16, 16, 512)
 Load  |  Yep!  | conv 1x1p0_1 +bnorm leaky        | (?, 16, 16, 64)

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py:28: calling extract_image_patches (from tensorflow.python.ops.array_ops) with ksizes is deprecated and will be removed in a future version. Instructions for updating: ksizes is deprecated, use sizes instead

 Load  |  Yep!  | local flatten 2x2                | (?, 8, 8, 256)
 Load  |  Yep!  | concat [27, 24]                  | (?, 8, 8, 1280)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 8, 8, 1024)
 Init  |  Yep!  | conv 1x1p0_1 linear              | (?, 8, 8, 35)
-------+--------+----------------------------------+---------------

GPU mode with 0.8 usage

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py:132: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.

cfg\yolololo.cfg loss hyper-parameters:
H = 8
W = 8
box = 5
classes = 2
scales = [1.0, 5.0, 1.0, 1.0]

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\yolov2\train.py:87: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead.

Building cfg\yolololo.cfg loss

WARNING:tensorflow:From D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\yolov2\train.py:107: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

Building cfg\yolololo.cfg train op

WARNING:tensorflow:From D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\ops\math_grad.py:1205: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where

WARNING:tensorflow:From D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\training\rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version. Instructions for updating: Call initializer instance with the dtype argument instead of passing it to the constructor

2020-02-02 19:40:03.261935: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-02-02 19:40:03.296697: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2020-02-02 19:40:04.358576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2020-02-02 19:40:04.383774: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-02-02 19:40:04.458356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-02-02 19:40:17.320494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-02 19:40:17.341351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-02-02 19:40:17.350955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-02-02 19:40:17.404108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3276 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Finished in 64.64945793151855s

Enter training ...

cfg\yolololo.cfg parsing test\dataset\annotations
Parsing for ['car', 'truk']
[====================>]100% 9.xml
Statistics:
car: 71
truk: 4
Dataset size: 11
Dataset of 11 instance(s)
Training statistics:
    Learning rate : 1e-05
    Batch size    : 11
    Epoch number  : 1000
    Backup every  : 2000

2020-02-02 19:41:04.704891: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-02-02 19:41:04.738078: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Traceback (most recent call last):
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node 0-convolutional_2}}]]
     [[mul_34/_119]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node 0-convolutional_2}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "flow", line 6, in <module>
    cliHandler(sys.argv)
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\cli.py", line 33, in cliHandler
    print('Enter training ...'); tfnet.train()
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\flow.py", line 56, in train
    fetched = self.sess.run(fetches, feed_dict)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node 0-convolutional_2 (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py:71) ]]
     [[mul_34/_119]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node 0-convolutional_2 (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py:71) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node 0-convolutional_2:
 0-convolutional/kernel/read (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py:74)
 Pad (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py:69)

Input Source operations connected to node 0-convolutional_2: 0-convolutional/kernel/read (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py:74) Pad (defined at D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py:69)

Original stack trace for '0-convolutional_2':
  File "flow", line 6, in <module>
    cliHandler(sys.argv)
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\cli.py", line 26, in cliHandler
    tfnet = TFNet(FLAGS)
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py", line 75, in __init__
    self.build_forward()
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\build.py", line 115, in build_forward
    state = op_create(*args)
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\__init__.py", line 27, in op_create
    return op_types[layer_type](*args)
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\baseop.py", line 42, in __init__
    self.forward()
  File "D:\Kuliah\Tutorial YOLO\darkflow-master\darkflow\net\ops\convolution.py", line 71, in forward
    name = self.scope, strides = [1] + [self.lay.stride] * 2 + [1])
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1953, in conv2d
    name=name)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1071, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "D:\Program\Aplikasi\Anaconda3\envs\TA\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

ArieSiregar commented 4 years ago

I am using TensorFlow 1.14.0 with an Nvidia GTX 1050 Ti GPU.

beebrain commented 3 years ago

Try reducing the batch size, for example: --batch 8
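For a rough sense of why a smaller batch can help: the layer table in the log lists the output size of every layer, and activation memory grows linearly with the batch size (11 in the log). The sketch below is only a back-of-the-envelope estimate, not a measurement. It counts float32 forward activations for the first three conv layers from that table and ignores weights, gradients, and cuDNN workspace, so real usage during training will be higher. The name activation_mb is made up for this illustration.

```python
# Rough estimate of float32 activation memory at different batch sizes.
# Shapes (H, W, C) are copied from the layer table in the log above.
# Only forward activations of three layers are counted (an assumption made
# to keep the example short), so this is a lower bound, not a real total.
LAYER_SHAPES = [
    (256, 256, 32),  # first conv 3x3p1_1 +bnorm leaky
    (128, 128, 64),  # second conv 3x3p1_1 +bnorm leaky
    (64, 64, 128),   # third conv 3x3p1_1 +bnorm leaky
]
BYTES_PER_FLOAT32 = 4


def activation_mb(batch_size, shapes=LAYER_SHAPES):
    """Return the activation memory in MB for the given batch size."""
    floats_per_image = sum(h * w * c for h, w, c in shapes)
    total_bytes = batch_size * floats_per_image * BYTES_PER_FLOAT32
    return total_bytes / (1024 ** 2)


if __name__ == "__main__":
    for batch in (11, 8, 4):
        print(f"--batch {batch}: about {activation_mb(batch):.0f} MB for these three layers")
```

Whatever the exact total is, it scales with the batch size, so going from 11 to 8 cuts this part of the footprint by roughly a quarter.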

fangfap commented 2 years ago

Decreasing --gpu to 0.7 or less worked for me 💯
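A possible explanation, offered as a guess rather than a confirmed diagnosis: the log reports the device as created "with 3276 MB memory" for --gpu 0.8, which is consistent with 80% of a 4096 MB card. If TensorFlow reserves that much up front, cuDNN and anything else using the GPU (for example the desktop) have to fit in what is left, and CUDNN_STATUS_ALLOC_FAILED may be the result. The sketch below just does that arithmetic. The 4096 MB total is an assumption about the GTX 1050 Ti in question, and reserved_mb is a hypothetical helper name.

```python
# Sketch: how much GPU memory a given --gpu fraction would reserve.
# TOTAL_MB is an assumption (a 4 GB card); it is not printed in the log.
TOTAL_MB = 4096


def reserved_mb(gpu_fraction, total_mb=TOTAL_MB):
    """Return the whole number of MB claimed by a per-process memory fraction."""
    if not 0.0 < gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be in the range (0, 1]")
    return int(total_mb * gpu_fraction)


if __name__ == "__main__":
    for fraction in (0.8, 0.7, 0.5):
        reserved = reserved_mb(fraction)
        print(f"--gpu {fraction}: about {reserved} MB reserved, "
              f"{TOTAL_MB - reserved} MB left for everything else")
```

With these assumptions, dropping from 0.8 to 0.7 frees roughly 400 MB outside TensorFlow's pool, which might be enough for the cuDNN handle to be created.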