DeepLearnPhysics / faster-rcnn


training on my own data #2

Open FasterChen opened 6 years ago

FasterChen commented 6 years ago

Hello, I am using this Faster R-CNN code base for my project. I am trying to train the model on a dataset with 1 class, but even after I changed the number and names of the classes it still doesn't work. The changes I made are in these files:

lib/datasets/mias.py
lib/rcnn_train/miasdata.py

I am using Python 3 and converted all the files to Python 3, but the PASCAL version works perfectly for me, so I don't think the problem is connected to the Python 2 to Python 3 upgrade.

I get two types of errors. The first one:

Traceback (most recent call last):
  File "example/train_mias.py", line 29, in <module>
    train_io = miasdata_gen(keyword=mias_keyword,cfg=net._cfg)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/miasdata.py", line 27, in __init__
    _,self.roidb = combined_roidb(keyword)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/datasets/api.py", line 38, in combined_roidb
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/datasets/api.py", line 38, in <listcomp>
    roidbs = [get_roidb(s) for s in imdb_names.split('+')]
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/datasets/api.py", line 35, in get_roidb
    roidb = get_training_roidb(imdb)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/datasets/api.py", line 18, in get_training_roidb
    rdl_roidb.prepare_roidb(imdb)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/datasets/roidb.py", line 51, in prepare_roidb
    assert all(max_classes[nonzero_inds] != 0)
AssertionError
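For context on the assertion in the traceback above, here is a minimal sketch of the kind of check it appears to perform. Only the line `assert all(max_classes[nonzero_inds] != 0)` comes from the traceback; the helper name `check_roidb_entry`, the `gt_overlaps` layout (one row per box, one column per class, column 0 = background) and the companion background check are assumptions based on the usual Faster R-CNN roidb preparation, not this repository's confirmed code.

```python
import numpy as np

def check_roidb_entry(gt_overlaps):
    """Return (bg_ok, fg_ok) for one roidb entry (hypothetical helper).

    gt_overlaps: array of shape (num_boxes, num_classes), assumed layout
    with column 0 reserved for the background class.
    """
    max_overlaps = gt_overlaps.max(axis=1)    # best overlap per box
    max_classes = gt_overlaps.argmax(axis=1)  # class index of that overlap
    zero_inds = np.where(max_overlaps == 0)[0]
    nonzero_inds = np.where(max_overlaps > 0)[0]
    # Boxes with no overlap are expected to be background (class 0).
    bg_ok = bool(np.all(max_classes[zero_inds] == 0))
    # Boxes with some overlap are expected to have a non-background class;
    # this is the condition that fails in the traceback.
    fg_ok = bool(np.all(max_classes[nonzero_inds] != 0))
    return bg_ok, fg_ok

# Two classes: index 0 = background, index 1 = the single object class.
good = np.array([[0.0, 1.0]])  # object box labelled with class index 1
bad = np.array([[1.0, 0.0]])   # object box labelled with class index 0

print(check_roidb_entry(good))
print(check_roidb_entry(bad))
```

If this matches what the repository does, one thing worth checking in a single-class dataset is whether the class list still starts with a background entry, so that the object class gets index 1 rather than 0.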

Or this more complicated one:

/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_layers/proposal_target_layer.py(153)_sample_rois()
-> keep_inds = np.append(fg_inds, bg_inds)
(Pdb) c
2018-04-15 20:40:20.630765: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [0,12] vs. [128,12]
  [[Node: gradients/LOSS/mul_9_grad/mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16_1/proposal_target_layer_2d/proposal_target/_349, gradients/LOSS/Sum_1_grad/Tile)]]
Traceback (most recent call last):
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1326, in _do_call
    return fn(*args)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1303, in _run_fn
    status, run_metadata)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 474, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [0,12] vs. [128,12]
  [[Node: gradients/LOSS/mul_9_grad/mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16_1/proposal_target_layer_2d/proposal_target/_349, gradients/LOSS/Sum_1_grad/Tile)]]
  [[Node: LOSS/add_5/_365 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1970_LOSS/add_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "example/train_mias.py", line 32, in <module>
    train_net(net, 'output/mias','tensorboard/mias', train_io, val_io, '%s/data/vgg16.ckpt' % os.environ['RCNNDIR'],int(num_iter))
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 289, in train_net
    sw.train_model(sess, max_iters)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 246, in train_model
    self.net.train_step(sess, blobs, train_op)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/faster_rcnn.py", line 504, in train_step
    feed_dict=feed_dict)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _do_run
    options, run_metadata)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1339, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [0,12] vs. [128,12]
  [[Node: gradients/LOSS/mul_9_grad/mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16_1/proposal_target_layer_2d/proposal_target/_349, gradients/LOSS/Sum_1_grad/Tile)]]
  [[Node: LOSS/add_5/_365 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1970_LOSS/add_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'gradients/LOSS/mul_9_grad/mul_1', defined at:
  File "example/train_mias.py", line 32, in <module>
    train_net(net, 'output/mias','tensorboard/mias', train_io, val_io, '%s/data/vgg16.ckpt' % os.environ['RCNNDIR'],int(num_iter))
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 289, in train_net
    sw.train_model(sess, max_iters)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 192, in train_model
    lr, train_op = self.construct_graph(sess)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 92, in construct_graph
    gvs = self.optimizer.compute_gradients(loss)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py", line 414, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 581, in gradients
    grad_scope, op, func_call, lambda: grad_fn(op, out_grads))
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 353, in _MaybeCompile
    return grad_fn() # Exit early
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 581, in <lambda>
    grad_scope, op, func_call, lambda: grad_fn(op, out_grads))
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_grad.py", line 747, in _MulGrad
    array_ops.reshape(math_ops.reduce_sum(x * grad, ry), sy))
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 894, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1117, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2726, in _mul
    "Mul", x=x, y=y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'LOSS/mul_9', defined at:
  File "example/train_mias.py", line 32, in <module>
    train_net(net, 'output/mias','tensorboard/mias', train_io, val_io, '%s/data/vgg16.ckpt' % os.environ['RCNNDIR'],int(num_iter))
[elided 1 identical lines from previous traceback]
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 192, in train_model
    lr, train_op = self.construct_graph(sess)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/rcnn_train/trainer.py", line 84, in construct_graph
    anchor_ratios=cfg.ANCHOR_RATIOS)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/faster_rcnn.py", line 139, in create_architecture
    self._add_losses()
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/faster_rcnn.py", line 324, in _add_losses
    loss_box = self._smooth_l1_loss(bbox_pred, bbox_targets, bbox_inside_weights, bbox_outside_weights)
  File "/home/g3chen6sh9_gmail_com/Documents/faster-rcnn/lib/faster_rcnn.py", line 351, in _smooth_l1_loss
    out_loss_box = bbox_outside_weights * in_loss_box
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 894, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1117, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2726, in _mul
    "Mul", x=x, y=y, name=name)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/g3chen6sh9_gmail_com/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [0,12] vs. [128,12]
  [[Node: gradients/LOSS/mul_9_grad/mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16_1/proposal_target_layer_2d/proposal_target/_349, gradients/LOSS/Sum_1_grad/Tile)]]
  [[Node: LOSS/add_5/_365 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1970_LOSS/add_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I traced the problem to the file lib/rcnn_layers/proposal_target_layer.py: when fg and bg are both 0, I get this big error.
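To illustrate how an empty foreground and background selection could lead to a `[0,12]` tensor, here is a minimal NumPy sketch. Only the line `keep_inds = np.append(fg_inds, bg_inds)` comes from the pdb output above; the function name `sample_keep_inds`, the threshold values and the 12-column target array are hypothetical stand-ins, not the repository's actual configuration.

```python
import numpy as np

def sample_keep_inds(max_overlaps, fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1):
    """Pick foreground and background ROI indices (hypothetical thresholds)."""
    # Foreground: overlap with a ground-truth box at or above fg_thresh.
    fg_inds = np.where(max_overlaps >= fg_thresh)[0]
    # Background: overlap inside [bg_thresh_lo, bg_thresh_hi).
    bg_inds = np.where((max_overlaps < bg_thresh_hi) &
                       (max_overlaps >= bg_thresh_lo))[0]
    # Same combination step as in the pdb output.
    return np.append(fg_inds, bg_inds)

# 300 proposals whose best overlap is below every threshold: nothing is kept.
overlaps = np.full(300, 0.05)
keep_inds = sample_keep_inds(overlaps)

# Assumed target layout: 12 columns per ROI, matching the shape in the error.
bbox_targets = np.zeros((300, 12), dtype=np.float32)
print(keep_inds.size)                 # number of ROIs kept
print(bbox_targets[keep_inds].shape)  # shape passed on to the loss
```

In this sketch the selected targets have zero rows, while the rest of the graph still expects a full batch of 128 ROIs, which would be consistent with the `[0,12] vs. [128,12]` message. Whether the real cause is the thresholds, the proposals or the ground-truth boxes cannot be determined from the traceback alone.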

BTW, I tried 5 different Faster R-CNN implementations on GitHub for my project, and this one was the only one that worked for me, so thank you, DeepLearnPhysics.

yhs95 commented 6 years ago

Hello, when I trained on my own data, I got the following error too. How did you solve it?

Invalid argument: Incompatible shapes: [0,12] vs. [128,12]
  [[Node: gradients/LOSS/mul_9_grad/mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16_1/proposal_target_layer_2d/proposal_target/_349, gradients/LOSS/Sum_1_grad/Tile)]]