PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.

Objects365 dataset: Invalid segm type: <class 'NoneType'> #543

Closed. ash12358 closed this issue 4 years ago.

ash12358 commented 4 years ago

I was training on the Objects365 dataset. I prepared the data exactly as described in https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/docs/featured_model/CACascadeRCNN.md and ran `CUDA_VISIBLE_DEVICES=1 python tools/train.py -c configs/obj365/cascade_rcnn_dcnv2_se154_vd_fpn_gn_cas.yml`, but training fails with an error.

My own preliminary investigation points to the following:

At line 102 of ppdet/data/source/coco.py, `instances = coco.loadAnns(ins_anno_ids)` returns a list of length 0. Because of that, `num_bbox = len(bboxes)` at line 119 evaluates to 0, and in the record built at line 138, `coco_rec = {'im_file': im_fname, 'im_id': np.array([img_id]), 'h': im_h, 'w': im_w, 'is_crowd': is_crowd, 'gt_class': gt_class, 'gt_bbox': gt_bbox, 'gt_score': gt_score, 'gt_poly': gt_poly}`, the fields gt_class, gt_bbox and gt_score all end up None or empty.
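
For reference, the loading path being described can be paraphrased roughly as follows (a sketch of the release/0.2 logic, not the verbatim repository code; the annotation path and image id are the ones from this report):

```python
# Sketch of the failing path in ppdet/data/source/coco.py (release/0.2);
# paraphrased, not verbatim. Line numbers are those cited above.
from pycocotools.coco import COCO

coco = COCO('dataset/objects365/annotations/train.json')

img_id = 136386  # the example image discussed below
ins_anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=False)  # line 101
instances = coco.loadAnns(ins_anno_ids)  # line 102: comes back as []

bboxes = [inst['bbox'] for inst in instances]
num_bbox = len(bboxes)  # line 119: 0, so every gt_* array built from it
                        # and stored in coco_rec at line 138 is empty
```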

For example:

Take the image obj365_train_000000136386.jpg. Its record under images in train.json is `{'file_name': 'obj365_train_000000136386.jpg', 'id': 136386, 'width': 640, 'height': 480}`, and its record under annotations is `{'area': 29727.047166903096, 'category_id': 3, 'image_id': 136386, 'id': 2088, 'bbox': [-0.004699712, 263.249328624, 639.927185088, 46.453796400000044], 'iscrowd': 1}`. So the image does have one annotation box, and it looks valid, yet the instances list that coco.py reads back is `[]`. I cannot tell which step of the loading goes wrong.

The full error log is below. Please take a look; thanks.

(paddle) w@i:/data/ssh/PaddleDetection$ CUDA_VISIBLE_DEVICES=1 python tools/train.py -c configs/obj365/cascade_rcnn_dcnv2_se154_vd_fpn_gn_cas.yml
CascadeBBoxAssigner:
  batch_size_per_im: 1024
  bbox_reg_weights:

W0423 11:05:14.742431  9435 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 10.0, Runtime API Version: 10.0
W0423 11:05:14.748662  9435 device_context.cc:245] device: 0, cuDNN Version: 7.6.
2020-04-23 11:05:18,194-INFO: Found /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained
2020-04-23 11:05:18,195-INFO: Loading parameters from /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained...
2020-04-23 11:05:18,195-WARNING: /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-04-23 11:05:18,195-WARNING: /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
2020-04-23 11:05:18,298-WARNING: variable file [ /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_fcn_logits_w /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/conv5_mask_b /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_fcn_logits_b /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/conv5_mask_w /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_weights ] not used
2020-04-23 11:05:18,298-WARNING: variable file [ /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_fcn_logits_w /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_weights /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_2_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_1_scale /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_4_offset /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/conv5_mask_b /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_fcn_logits_b /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/conv5_mask_w /home/w/.cache/paddle/weights/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_coco_pretrained/mask_inter_feat_3_weights ] not used
loading annotations into memory...
Done (t=112.57s)
creating index...
index created!
2020-04-23 11:07:50,700-WARNING: Found an invalid bbox in annotations: im_id: 608537, area: 0.0 x1: 269.571228032, y1: 248.73785399999997, x2: 269.651184064, y2: 248.73785399999997.
2020-04-23 11:07:54,409-WARNING: Found an invalid bbox in annotations: im_id: 549169, area: 0.0 x1: 742.3126221, y1: 8.545593250000001, x2: 742.3126221, y2: 10.2289734.
2020-04-23 11:07:54,923-WARNING: Found an invalid bbox in annotations: im_id: 91534, area: 0.0 x1: 433.56384276160003, y1: 420.4645385728, x2: 438.66027832960003, y2: 420.4645385728.
2020-04-23 11:07:55,166-WARNING: Found an invalid bbox in annotations: im_id: 555363, area: 0.0 x1: 337.9105372966, y1: 345.2622006784, x2: 337.9105372966, y2: 348.4704679424.
2020-04-23 11:07:57,040-WARNING: Found an invalid bbox in annotations: im_id: 536966, area: 0.0 x1: 554.6245116984, y1: 299.9646606336, x2: 554.6245116984, y2: 314.069702144.
2020-04-23 11:07:57,997-WARNING: Found an invalid bbox in annotations: im_id: 548210, area: 0.0 x1: 193.9760131584, y1: 273.4229126144, x2: 193.9760131584, y2: 273.5556640768.
2020-04-23 11:07:59,855-WARNING: Found an invalid bbox in annotations: im_id: 580684, area: 0.0 x1: 92.80249024999999, y1: 246.2189331, x2: 92.80249024999999, y2: 251.5336914125.
2020-04-23 11:08:01,507-WARNING: Found an invalid bbox in annotations: im_id: 545995, area: 12.73863515740429 x1: 776.3353271618, y1: 663.263916032, x2: 682.0, y2: 511.0.
2020-04-23 11:08:04,240-WARNING: Found an invalid bbox in annotations: im_id: 691979, area: 7.077892501177361 x1: 840.7648925851, y1: 316.9513549824, x2: 682.0, y2: 318.6118163968.
2020-04-23 11:08:07,744-WARNING: Found an invalid bbox in annotations: im_id: 652823, area: 1.154319683283666 x1: 703.4766845439999, y1: 513.236083968, x2: 639.0, y2: 479.0.
2020-04-23 11:08:11,566-WARNING: Found an invalid bbox in annotations: im_id: 535871, area: 0.0 x1: 444.405273457, y1: 427.4860839936, x2: 444.405273457, y2: 430.7103271424.
2020-04-23 11:08:15,927-WARNING: Found an invalid bbox in annotations: im_id: 359829, area: 46.71179995933256 x1: 613.3458251776, y1: 662.7961425408, x2: 511.0, y2: 671.4617920256001.
2020-04-23 11:08:21,938-WARNING: Found an invalid bbox in annotations: im_id: 533112, area: 0.0 x1: 652.6395257820001, y1: 150.0715779584, x2: 652.6395257820001, y2: 151.5980086272.
2020-04-23 11:08:26,161-WARNING: Found an invalid bbox in annotations: im_id: 280484, area: 4.1640145666077 x1: 683.2222900679001, y1: 426.990234368, x2: 682.0, y2: 428.3464965632.
2020-04-23 11:08:27,337-WARNING: Found an invalid bbox in annotations: im_id: 335604, area: 0.0 x1: 132.0651185036, y1: 490.995319296, x2: 135.152460152, y2: 490.995319296.
2020-04-23 11:09:06,249-WARNING: Found an invalid bbox in annotations: im_id: 545724, area: 3.1565472338744165 x1: 408.56115725259997, y1: 512.0473632768, x2: 411.9129638641, y2: 511.0.
2020-04-23 11:09:08,154-WARNING: Found an invalid bbox in annotations: im_id: 646677, area: 4.7767199789934365 x1: 300.07025145, y1: 390.55169676, x2: 299.0, y2: 391.50653076000003.
2020-04-23 11:09:13,348-WARNING: Found an invalid bbox in annotations: im_id: 574787, area: 0.0 x1: 633.690185536, y1: 370.5393676572, x2: 634.97998048, y2: 370.5393676572.
2020-04-23 11:09:14,792-WARNING: Found an invalid bbox in annotations: im_id: 521927, area: 0.0 x1: 55.575744628100004, y1: 270.0220947456, x2: 57.9324340495, y2: 270.0220947456.
2020-04-23 11:09:17,220-WARNING: Found an invalid bbox in annotations: im_id: 527817, area: 657.4470979715645 x1: 515.3116454912, y1: 674.2288818061, x2: 511.0, y2: 910.0.
2020-04-23 11:09:19,779-WARNING: Found an invalid bbox in annotations: im_id: 145983, area: 0.0 x1: 211.64685056, y1: 203.173278816, x2: 211.64685056, y2: 203.356811504.
2020-04-23 11:09:20,553-WARNING: Found an invalid bbox in annotations: im_id: 599992, area: 50.02852134771837 x1: 718.86511232, y1: 441.29223632120005, x2: 639.0, y2: 435.0.
2020-04-23 11:09:22,498-WARNING: Found an invalid bbox in annotations: im_id: 365040, area: 2.20002733606966 x1: 536.6248779263999, y1: 511.0604858368, x2: 536.7738036992, y2: 511.0.
2020-04-23 11:09:26,303-WARNING: Found an invalid bbox in annotations: im_id: 657457, area: 8.41608131001895 x1: 479.75616456, y1: 0.028564480000000003, x2: 479.0, y2: 531.41967776.
2020-04-23 11:09:28,668-WARNING: Found an invalid bbox in annotations: im_id: 560944, area: 0.0 x1: 281.2375488, y1: 306.20666505599996, x2: 281.417053248, y2: 306.20666505599996.
2020-04-23 11:09:32,803-WARNING: Found an invalid bbox in annotations: im_id: 650302, area: 27.592423196214177 x1: 0, y1: 653.989868159, x2: 4.872924800000002, y2: 477.0.
2020-04-23 11:09:45,668-WARNING: Found an invalid bbox in annotations: im_id: 344510, area: 0.0 x1: 878.1358642939999, y1: 360.5763549696, x2: 771.0, y2: 360.5763549696.
2020-04-23 11:09:48,981-WARNING: Found an invalid bbox in annotations: im_id: 547644, area: 0.0 x1: 432.0398170408, y1: 374.5224859648, x2: 432.0398170408, y2: 377.7821602816.
2020-04-23 11:09:52,356-INFO: 608606 samples in file dataset/objects365/annotations/train.json
2020-04-23 11:11:05,879-INFO: places would be ommited when DataLoader is not iterable
I0423 11:11:06.125291  9435 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I0423 11:11:06.342186  9435 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
I0423 11:11:08.811236  9435 parallel_executor.cc:307] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I0423 11:11:09.054006  9435 parallel_executor.cc:375] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0
2020-04-23 11:11:09,563-INFO: fail to map op [RandomFlipImage_e2210c] with error: Invalid segm type: <class 'NoneType'> and stack:
Traceback (most recent call last):
  File "/data/ssh/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 431, in __call__
    height, width)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 376, in flip_segms
    if is_poly(segm):
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 371, in is_poly
    "Invalid segm type: {}".format(type(segm))
AssertionError: Invalid segm type: <class 'NoneType'>

2020-04-23 11:11:09,563-WARNING: recv endsignal from outq with errmsg[consumer[consumer-cf8-3] failed to map with error:[Invalid segm type: <class 'NoneType'>]]
2020-04-23 11:11:10,032-INFO: fail to map op [RandomFlipImage_e2210c] with error: Invalid segm type: <class 'NoneType'> and stack:
Traceback (most recent call last):
  File "/data/ssh/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 431, in __call__
    height, width)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 376, in flip_segms
    if is_poly(segm):
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 371, in is_poly
    "Invalid segm type: {}".format(type(segm))
AssertionError: Invalid segm type: <class 'NoneType'>

2020-04-23 11:11:10,033-WARNING: recv endsignal from outq with errmsg[consumer[consumer-cf8-2] failed to map with error:[Invalid segm type: <class 'NoneType'>]]
2020-04-23 11:11:10,035-INFO: fail to map op [RandomFlipImage_e2210c] with error: Invalid segm type: <class 'NoneType'> and stack:
Traceback (most recent call last):
  File "/data/ssh/PaddleDetection/ppdet/data/reader.py", line 45, in __call__
    data = f(data, ctx)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 431, in __call__
    height, width)
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 376, in flip_segms
    if is_poly(segm):
  File "/data/ssh/PaddleDetection/ppdet/data/transform/operators.py", line 371, in is_poly
    "Invalid segm type: {}".format(type(segm))
AssertionError: Invalid segm type: <class 'NoneType'>

W0423 11:11:20.876386 22791 operator.cc:181] deformable_conv raises an exception paddle::memory::allocation::BadAlloc,


C++ Call Stacks (More useful to developers):

0   std::string paddle::platform::GetTraceBackString(std::string&&, char const*, int)
1   paddle::memory::allocation::CUDAAllocator::AllocateImpl(unsigned long)
2   paddle::memory::allocation::AlignedAllocator::AllocateImpl(unsigned long)
3   paddle::memory::allocation::AutoGrowthBestFitAllocator::AllocateImpl(unsigned long)
4   paddle::memory::allocation::Allocator::Allocate(unsigned long)
5   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
6   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
7   paddle::memory::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::Alloc(paddle::platform::DeviceContext const&, unsigned long)
9   paddle::framework::Tensor paddle::framework::ExecutionContext::AllocateTmpTensor<float, paddle::platform::CUDADeviceContext>(paddle::framework::DDim const&, paddle::platform::CUDADeviceContext const&) const
10  paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
11  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
12  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
14  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
15  paddle::framework::details::ComputationOpHandle::RunImpl()
16  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
17  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue<unsigned long> > const&, unsigned long*)
18  std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
19  std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
20  ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Error Message Summary:

ResourceExhaustedError:

Out of memory error on GPU 0. Cannot allocate 119.531494MB memory on GPU 0, available memory is only 27.625000MB.

Please check whether there is any other process using GPU 0.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.

    at (/paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:69)

F0423 11:11:20.877792 22791 exception_holder.h:37] std::exception caught,


C++ Call Stacks (More useful to developers):

0   std::string paddle::platform::GetTraceBackString(std::string&&, char const*, int)
1   paddle::memory::allocation::CUDAAllocator::AllocateImpl(unsigned long)
2   paddle::memory::allocation::AlignedAllocator::AllocateImpl(unsigned long)
3   paddle::memory::allocation::AutoGrowthBestFitAllocator::AllocateImpl(unsigned long)
4   paddle::memory::allocation::Allocator::Allocate(unsigned long)
5   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
6   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
7   paddle::memory::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::Alloc(paddle::platform::DeviceContext const&, unsigned long)
9   paddle::framework::Tensor paddle::framework::ExecutionContext::AllocateTmpTensor<float, paddle::platform::CUDADeviceContext>(paddle::framework::DDim const&, paddle::platform::CUDADeviceContext const&) const
10  paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
11  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::DeformableConvCUDAKernel<paddle::platform::CUDADeviceContext, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
12  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
14  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
15  paddle::framework::details::ComputationOpHandle::RunImpl()
16  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
17  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp(paddle::framework::details::OpHandleBase*, std::shared_ptr<paddle::framework::BlockingQueue<unsigned long> > const&, unsigned long*)
18  std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
19  std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
20  ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Error Message Summary:

ResourceExhaustedError:

Out of memory error on GPU 0. Cannot allocate 119.531494MB memory on GPU 0, available memory is only 27.625000MB.

Please check whether there is any other process using GPU 0.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.

    at (/paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:69)

Check failure stack trace:
    @     0x7f73f3ba2c2d  google::LogMessage::Fail()
    @     0x7f73f3ba66dc  google::LogMessage::SendToLog()
    @     0x7f73f3ba2753  google::LogMessage::Flush()
    @     0x7f73f3ba7bee  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f73f617c9b8  paddle::framework::details::ExceptionHolder::Catch()
    @     0x7f73f622868e  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync()
    @     0x7f73f622729f  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp()
    @     0x7f73f6227564  _ZNSt17_Function_handlerIFvvESt17reference_wrapperISt12_Bind_simpleIFS1_ISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS6_12OpHandleBaseESt6atomicIiESt4hashISA_ESt8equal_toISA_ESaISt4pairIKSA_SC_EEESA_RKSt10shared_ptrINS5_13BlockingQueueImEEEEUlvE_vEEEvEEEE9_M_invokeERKSt9_Any_data
    @     0x7f73f3bfb983  std::_Function_handler<>::_M_invoke()
    @     0x7f73f3989c37  std::__future_base::_State_base::_M_do_set()
    @     0x7f7535d2fa99  pthread_once_slow
    @     0x7f73f6222a52  _ZNSt13__future_base11_Task_stateISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS4_12OpHandleBaseESt6atomicIiESt4hashIS8_ESt8equal_toIS8_ESaISt4pairIKS8_SA_EEES8_RKSt10shared_ptrINS3_13BlockingQueueImEEEEUlvE_vEESaIiEFvvEE6_M_runEv
    @     0x7f73f398be64  _ZZN10ThreadPoolC1EmENKUlvE_clEv
    @     0x7f744b6db421  execute_native_thread_routine_compat
    @     0x7f7535d286ba  start_thread
    @     0x7f7535a5e41d  clone
    @              (nil)  (unknown)

Aborted (core dumped)
(paddle) w@i:/data/ssh/PaddleDetection$

yghstill commented 4 years ago

Thanks for the report. We will try to reproduce it and get back to you shortly.

FDInSky commented 4 years ago

Please check that the JSON paths in your config are correct, or write a quick local demo that calls coco.loadAnns() to verify that pycocotools behaves as expected.
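
Along those lines, here is a minimal sketch of such a demo. It assumes the annotation file sits at dataset/objects365/annotations/train.json, as in the log above, and reuses the image id from the report:

```python
# Minimal pycocotools check for one image from the report above.
from pycocotools.coco import COCO

coco = COCO('dataset/objects365/annotations/train.json')
img_id = 136386  # obj365_train_000000136386.jpg

# Mirrors ppdet/data/source/coco.py: the iscrowd=False filter keeps only
# annotations with iscrowd == 0, so an all-crowd image yields [].
print(coco.loadAnns(coco.getAnnIds(imgIds=img_id, iscrowd=False)))

# iscrowd=None disables the filter and returns every annotation,
# which shows whether the JSON itself is being read correctly.
print(coco.loadAnns(coco.getAnnIds(imgIds=img_id, iscrowd=None)))
```

If the second call prints the annotation while the first prints an empty list, pycocotools itself is fine and the iscrowd filter is the culprit, which matches the root cause identified later in this thread.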

ash12358 commented 4 years ago

@Noplz GeForce RTX 2080 Ti

Noplz commented 4 years ago

> @Noplz GeForce RTX 2080 Ti

I suggest removing every resolution above 800 from the multiscale training list in the config and trying again; this is most likely caused by insufficient GPU memory. Also change `max_size:` to 1333. A sketch of that edit follows.
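
A hedged sketch of what that edit might look like, assuming the reader pipeline uses the `!ResizeImage` operator found in PaddleDetection 0.x configs; the exact target_size values shipped in cascade_rcnn_dcnv2_se154_vd_fpn_gn_cas.yml may differ:

```yaml
# Sketch only; the surrounding config structure is assumed, not copied
# verbatim from the repository.
sample_transforms:
  - !ResizeImage
    # multiscale training resolutions, capped at 800 to save GPU memory
    target_size: [416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
    max_size: 1333   # lowered as suggested above
    interp: 1
    use_cv2: true
```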

Noplz commented 4 years ago

> @Noplz GeForce RTX 2080 Ti

We have reproduced the error on our side and will fix it as soon as possible.

ash12358 commented 4 years ago

@Noplz The problem is triggered by `iscrowd=False` at line 101 of ppdet/data/source/coco.py, `ins_anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=False)`. Some images have annotations that are all iscrowd=1, so the ins_anno_ids retrieved for them is an empty set.
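
Until the upstream fix lands, one hypothetical local workaround is to pre-filter the annotation file so that no image is left with only crowd annotations. A sketch of that idea follows; the output filename is made up, and this is not the maintainers' fix:

```python
# Drop images whose annotations are all iscrowd=1, plus their annotations,
# so the loader never builds an empty ground-truth record.
import json

with open('dataset/objects365/annotations/train.json') as f:
    data = json.load(f)

# Image ids that keep at least one non-crowd annotation.
valid = {a['image_id'] for a in data['annotations'] if a.get('iscrowd', 0) == 0}

data['images'] = [im for im in data['images'] if im['id'] in valid]
data['annotations'] = [a for a in data['annotations'] if a['image_id'] in valid]

with open('dataset/objects365/annotations/train_filtered.json', 'w') as f:
    json.dump(data, f)
```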

yghstill commented 4 years ago

@ash12358 Please pull the latest code from master or the release/0.3 branch and try again.