PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

Why does faster_rcnn_r50_1x training fail when the batch size is changed to 2? It works with a batch size of 1. Thanks #1008

Closed liu-zhi97 closed 4 years ago

liu-zhi97 commented 4 years ago

Here is the configuration.

faster_reader.yml

TrainReader:
  inputs_def:
    fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_crowd']
  dataset:
    !COCODataSet
    image_dir: train2017
    anno_path: annotations/instances_train2017.json
    dataset_dir: /home/liu/Downloads/PaddleDetection-release-0.3/dataset/coco
  sample_transforms:

EvalReader:
  inputs_def:
    fields: ['image', 'im_info', 'im_id', 'im_shape']
    # for voc
    #fields: ['image', 'im_info', 'im_id', 'im_shape', 'gt_bbox', 'gt_class', 'is_difficult']
  dataset:
    !COCODataSet
    image_dir: val2017
    anno_path: annotations/instances_val2017.json
    dataset_dir: dataset/coco
  sample_transforms:

TestReader:
  inputs_def:
    fields: ['image', 'im_info', 'im_id', 'im_shape']
  dataset:
    !ImageFolder
    anno_path: annotations/instances_val2017.json
  sample_transforms:

faster_rcnn_r50_1x.yml

architecture: FasterRCNN
use_gpu: true
max_iters: 180000
log_smooth_window: 20
save_dir: output
snapshot_iter: 10000
pretrain_weights: /home/liu/Downloads/PaddleDetection-release-0.3/weight/ResNet50_cos_pretrained
metric: COCO
weights: output/faster_rcnn_r50_1x/model_final
num_classes: 81

FasterRCNN:
  backbone: ResNet
  rpn_head: RPNHead
  roi_extractor: RoIAlign
  bbox_head: BBoxHead
  bbox_assigner: BBoxAssigner

ResNet:
  norm_type: affine_channel
  depth: 50
  feature_maps: 4
  freeze_at: 2

ResNetC5:
  depth: 50
  norm_type: affine_channel

RPNHead:
  anchor_generator:
    anchor_sizes: [32, 64, 128, 256, 512]
    aspect_ratios: [0.5, 1.0, 2.0]
    stride: [16.0, 16.0]
    variance: [1.0, 1.0, 1.0, 1.0]
  rpn_target_assign:
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_negative_overlap: 0.3
    rpn_positive_overlap: 0.7
    rpn_straddle_thresh: 0.0
    use_random: true
  train_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    pre_nms_top_n: 12000
    post_nms_top_n: 2000
  test_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    pre_nms_top_n: 6000
    post_nms_top_n: 1000

RoIAlign:
  resolution: 14
  sampling_ratio: 0
  spatial_scale: 0.0625

BBoxAssigner:
  batch_size_per_im: 512
  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
  bg_thresh_hi: 0.5
  bg_thresh_lo: 0.0
  fg_fraction: 0.25
  fg_thresh: 0.5

BBoxHead:
  head: ResNetC5
  nms:
    keep_top_k: 100
    nms_threshold: 0.5
    score_threshold: 0.05

LearningRate:
  base_lr: 0.0025
  schedulers:

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0001
    type: L2

_READER_: 'faster_reader.yml'

Here is the error:

2020-07-02 16:08:51,766-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000100] in Optimizer will not take effect, and it will only be applied to other Parameters!
W0702 16:08:52.349746 6342 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0
W0702 16:08:52.351637 6342 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2020-07-02 16:08:53,186-WARNING: /home/liu/Downloads/PaddleDetection-release-0.3/weight/ResNet50_cos_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-07-02 16:08:53,820-WARNING: /home/liu/Downloads/PaddleDetection-release-0.3/weight/ResNet50_cos_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-07-02 16:08:53,874-WARNING: variable file [ /home/liu/Downloads/PaddleDetection-release-0.3/weight/ResNet50_cos_pretrained/fc_0.w_0 /home/liu/Downloads/PaddleDetection-release-0.3/weight/ResNet50_cos_pretrained/fc_0.b_0 ] not used
loading annotations into memory...
Done (t=8.54s)
creating index...
index created!
2020-07-02 16:09:07,561-WARNING: Found an invalid bbox in annotations: im_id: 200365, area: 0.0 x1: 296.65, y1: 388.33, x2: 296.67999999999995, y2: 388.33.
2020-07-02 16:09:21,048-WARNING: Found an invalid bbox in annotations: im_id: 550395, area: 0.0 x1: 9.98, y1: 188.56, x2: 14.52, y2: 188.56.
2020-07-02 16:09:32,366-INFO: places would be ommited when DataLoader is not iterable
2020-07-02 16:13:26,959-WARNING: Your reader has raised an exception!
Exception in thread Thread-10:
Traceback (most recent call last):
  File "/home/liu/.conda/envs/paddle/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/liu/.conda/envs/paddle/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1157, in __thread_main__
    six.reraise(*sys.exc_info())
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1137, in __thread_main__
    for tensors in self._tensor_reader():
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1207, in __tensor_reader_impl__
    for slots in paddle_reader():
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/data_feeder.py", line 507, in __reader_creator__
    yield self.feed(item)
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/data_feeder.py", line 348, in feed
    ret_dict[each_name] = each_converter.done()
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/data_feeder.py", line 157, in done
    arr = np.array(self.data, dtype=self.dtype)
ValueError: could not broadcast input array from shape (3,800,1199) into shape (3,800)

/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1741, in <module>
    main()
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/liu/Downloads/PaddleDetection-release-0.3/tools/train.py", line 369, in <module>
    main()
  File "/home/liu/Downloads/PaddleDetection-release-0.3/tools/train.py", line 241, in main
    outs = exe.run(compiled_train_prog, fetch_list=train_values)
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
    return_merged=return_merged)
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
    tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator >)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator >)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Python Call Stacks (More useful to users):

  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2610, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1079, in _init_non_iterable
    attrs={'drop_last': self._drop_last})
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 977, in __init__
    self._init_non_iterable()
  File "/home/liu/.conda/envs/paddle/lib/python3.6/site-packages/paddle/fluid/reader.py", line 608, in from_generator
    iterable, return_list, drop_last)
  File "/home/liu/Downloads/PaddleDetection-release-0.3/ppdet/modeling/architectures/faster_rcnn.py", line 236, in build_inputs
    iterable=iterable) if use_dataloader else None
  File "/home/liu/Downloads/PaddleDetection-release-0.3/tools/train.py", line 113, in main
    feed_vars, train_loader = model.build_inputs(**inputs_def)
  File "/home/liu/Downloads/PaddleDetection-release-0.3/tools/train.py", line 369, in <module>
    main()
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/liu/pycharm-2018.3.4/helpers/pydev/pydevd.py", line 1741, in <module>
    main()


Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception [Hint: Expected killed != true, but received killed:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141) [operator < read > error]
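
The ValueError raised in the reader thread (could not broadcast input array from shape (3,800,1199) into shape (3,800)) comes from `np.array(self.data, dtype=self.dtype)` in `data_feeder.py`. It is consistent with two images of different widths being stacked into a single batch array, which would explain why a batch size of 1 works and 2 does not. The sketch below reproduces that failure with plain NumPy on small stand-in shapes and shows one common workaround: zero-padding every image to the largest height and width in the batch. `pad_batch` and `try_stack` are hypothetical helper names used only for illustration; they are not PaddleDetection APIs, and the thread does not say what the master branch actually changed.

```python
import numpy as np


def try_stack(images):
    """Stack images the way the feeder does; return None if shapes disagree."""
    try:
        return np.array(images, dtype=np.float32)
    except ValueError:
        # Raised when the per-image shapes differ (the exact message varies by NumPy version).
        return None


def pad_batch(images, pad_value=0.0):
    """Pad CHW images to the batch's max height/width, then stack them (hypothetical helper)."""
    channels = images[0].shape[0]
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    batch = np.full((len(images), channels, max_h, max_w), pad_value, dtype=np.float32)
    for i, img in enumerate(images):
        _, h, w = img.shape
        batch[i, :, :h, :w] = img  # copy the image into the top-left corner
    return batch


# Small stand-ins for two resized images with equal height but different width.
images = [
    np.ones((3, 8, 12), dtype=np.float32),
    np.ones((3, 8, 10), dtype=np.float32),
]

naive = try_stack(images)    # stacking fails because the widths differ
padded = pad_batch(images)   # padding gives every image the same shape
print(naive is None, padded.shape)
```

With a batch size of 1 there is only one image per batch, so the shapes never have to agree, which matches the behaviour reported in the question.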

jerrywgz commented 4 years ago

This issue has been fixed on the master branch.

liu-zhi97 commented 4 years ago

> This issue has been fixed on the master branch.

Thanks, the master branch indeed does not have this problem.