Closed YueXiNPU closed 6 years ago
Same question here, +1
Training issue:
Please confirm that your code is up to date and that memory optimization is enabled:
https://github.com/PaddlePaddle/models/blob/develop/fluid/face_detection/train.py#L128
Before training, make sure no other jobs are running on the GPU. For a single card, set: export CUDA_VISIBLE_DEVICES=0
FLAGS_fraction_of_gpu_memory_to_use defaults to 0.92. If you set it yourself, do not make it too small; it must be at least as large as the memory the whole model needs. Try something like:
export FLAGS_fraction_of_gpu_memory_to_use=0.89
If that does not help, just fall back to the default value.
If it still fails after all of the above, try reducing resize_h and resize_w; the current default is 640. https://github.com/PaddlePaddle/models/blob/develop/fluid/face_detection/train.py#L29
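The single-card setup above can also be applied from inside a launcher script. A minimal sketch in Python (these variables must be set before the framework initializes CUDA; 0.89 is just the example value from this thread):

```python
# Minimal sketch: apply the suggested single-GPU settings before importing paddle.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"                    # train on GPU 0 only
os.environ["FLAGS_fraction_of_gpu_memory_to_use"] = "0.89"  # default is 0.92

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Setting them in the shell with export before running python -u train.py works the same way.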
Evaluation issue:
This is also insufficient GPU memory. export FLAGS_fraction_of_gpu_memory_to_use=0.1
is too small; use the default value instead.
Likewise, make sure no other jobs occupy the GPU during evaluation.
If it still fails, try adding a memory-optimization pass after https://github.com/PaddlePaddle/models/blob/develop/fluid/face_detection/widerface_eval.py#L311
like this:
infer_program, nmsed_out = network.infer(main_program)
fluid.memory_optimize(infer_program)
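As a rough sanity check on the fraction value: the flag pre-allocates approximately fraction × total GPU memory for PaddlePaddle's allocator pool, so 0.1 on an 11 GB card leaves only about 1.1 GB for the whole model. A minimal sketch of that arithmetic (11019 MiB is taken from the nvidia-smi output in this thread):

```python
# Minimal sketch: how much GPU memory each setting of
# FLAGS_fraction_of_gpu_memory_to_use pre-allocates on a given card.
def preallocated_mib(total_mib: float, fraction: float) -> float:
    """Approximate size of PaddlePaddle's pre-allocated GPU memory pool."""
    return total_mib * fraction

TOTAL_MIB = 11019  # total memory reported by nvidia-smi in this thread

for fraction in (0.1, 0.89, 0.92):
    print(f"fraction={fraction}: ~{preallocated_mib(TOTAL_MIB, fraction):.0f} MiB")
```

This is why 0.1 fails immediately while values near the default can still work: the pre-allocated pool must at least cover the model's working set.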
Thank you very much for your patient answers; the problem has been solved. Thanks again. We hope to use your work as the starting point for ours, and to obtain even better results on the dataset.
----------- Configuration Arguments -----------
batch_num: None
batch_size: 4
data_dir: data
enable_ce: False
epoc_num: 160
learning_rate: 0.001
mean_BGR: 104., 117., 123.
model_save_dir: output
num_devices: 1
parallel: True
pretrained_model: vgg_ilsvrc_16_fc_reduced
resize_h: 640
resize_w: 640
use_gpu: True
use_multiprocess: True
use_pyramidbox: True
------------------------------------------------
Traceback (most recent call last):
File "train.py", line 284, in <module>
train(args, config, train_parameters, train_file_list)
File "train.py", line 157, in train
exe.run(startup_prog)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 651, in run
use_program_cache=use_program_cache)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 749, in _run
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: out of memory at [/paddle/paddle/fluid/platform/device_context.cc:243]
PaddlePaddle Call Stacks:
0 0x7fa1d9007890p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1 0x7fa1d9007c09p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7fa1db1160f3p paddle::platform::CUDADeviceContext::CUDADeviceContext(paddle::platform::CUDAPlace) + 3203
3 0x7fa1db11a518p std::_Function_handler<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > (), std::reference_wrapper<std::_Bind_simple<paddle::platform::EmplaceDeviceContext<paddle::platform::CUDADeviceContext, paddle::platform::CUDAPlace>(std::map<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > >, std::less<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> >, std::allocator<std::pair<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, 
boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > > > > >*, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>)::{lambda()#1} ()> > >::_M_invoke(std::_Any_data const&) + 104
4 0x7fa1db118acap std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > >, std::__future_base::_Result_base::_Deleter>, std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > > >::_M_invoke(std::_Any_data const&) + 42
5 0x7fa1d90d15a7p std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&) + 39
6 0x7fa22ad73827p
7 0x7fa1db11be0cp std::__future_base::_Deferred_state<std::_Bind_simple<paddle::platform::EmplaceDeviceContext<paddle::platform::CUDADeviceContext, paddle::platform::CUDAPlace>(std::map<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > >, std::less<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> >, std::allocator<std::pair<boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, 
boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const, std::shared_future<std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > > > > >*, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>)::{lambda()#1} ()>, std::unique_ptr<paddle::platform::DeviceContext, std::default_delete<paddle::platform::DeviceContext> > >::_M_run_deferred() + 220
8 0x7fa1db116289p paddle::platform::DeviceContextPool::Get(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 137
9 0x7fa1dafa4dedp paddle::framework::GarbageCollector::GarbageCollector(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 477
10 0x7fa1dafa5071p paddle::framework::UnsafeFastGPUGarbageCollector::UnsafeFastGPUGarbageCollector(paddle::platform::CUDAPlace const&, unsigned long) + 33
11 0x7fa1d91925fep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 238
12 0x7fa1d919572fp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
13 0x7fa1d8ff8f1dp
14 0x7fa1d903a1a6p
15 0x5674fcp _PyCFunction_FastCallDict + 860
16 0x50abb3p
17 0x50c5b9p _PyEval_EvalFrameDefault + 1097
18 0x508245p
19 0x50a080p
20 0x50aa7dp
21 0x50d390p _PyEval_EvalFrameDefault + 4640
22 0x508245p
23 0x50a080p
24 0x50aa7dp
25 0x50c5b9p _PyEval_EvalFrameDefault + 1097
26 0x508245p
27 0x50a080p
28 0x50aa7dp
29 0x50c5b9p _PyEval_EvalFrameDefault + 1097
30 0x508245p
31 0x50b403p PyEval_EvalCode + 35
32 0x635222p
33 0x6352d7p PyRun_FileExFlags + 151
34 0x638a8fp PyRun_SimpleFileExFlags + 383
35 0x639631p Py_Main + 1425
36 0x4b0f40p main + 224
37 0x7fa22afa4b97p __libc_start_main + 231
38 0x5b2fdap _start + 42
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... On | 00000000:01:00.0 Off | N/A |
| 42% 52C P2 40W / 260W | 10927MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
I have recently been deploying the PyramidBox (ECCV 2018) work. GPU info: GTX 1080 Ti, 11 GB memory, 3 cards.
Question 1: has anyone with this configuration run PyramidBox successfully on a single card (GTX 1080 Ti, 11 GB)?
Question 2: I tested with the complete model released on the official site, trying both one card and three cards. The command was
python -u widerface_eval.py --model_dir=output/159 --pred_dir=pred. I have already tried export FLAGS_fraction_of_gpu_memory_to_use=0.1, with no effect!
The error is as follows: