before31 closed this issue 8 months ago
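This problem is caused by an incorrectly configured train_dataloader. You can configure it by following the OCR model auto-compression example.
Pay particular attention to the reader_wrapper function in that example's run.py: use it to re-wrap the train_loader, then pass the wrapped loader to the auto-compression training.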
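To make the suggestion concrete, here is a minimal sketch of what such a wrapper could look like. It is not copied from the example's run.py, which may differ in detail. It assumes that the loader is a callable returning an iterator of batches, that the image is the first element of each batch, and that the model has a single input whose name is passed as `input_name`. `fake_loader` is a hypothetical stand-in for a real train_loader.

```python
def reader_wrapper(reader, input_name):
    """Adapt a loader that yields tuples/lists into one that yields feed dicts."""
    # Some configs give the input name as a one-element list; unwrap it.
    if isinstance(input_name, (list, tuple)) and len(input_name) == 1:
        input_name = input_name[0]

    def gen():
        for batch in reader():
            # Assumption: the image tensor is the first element of each batch.
            # A real loader would typically yield a float32 array here.
            yield {input_name: batch[0]}

    return gen


# Hypothetical stand-in for a real train_loader: two batches of (image, label).
def fake_loader():
    yield ([[0.0, 1.0]], "label_a")
    yield ([[2.0, 3.0]], "label_b")


wrapped = reader_wrapper(fake_loader, ["x"])
feeds = list(wrapped())
```

Each item in `feeds` is a dict keyed by the model's input name, with labels dropped, which is the form a feed-based loader is expected to have.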
After modifying my code according to the example you gave, it now runs, but an error is still raised during compression:
W0117 17:15:00.967579 13699 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0117 17:15:00.977397 13699 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2023-01-17 17:15:47,899-INFO: devices: gpu
2023-01-17 17:17:15,078-INFO: Selected strategies: ['ptq_hpo']
INFO:smac.utils.io.cmd_reader.CMDReader:Output to smac3-output_2023-01-17-09:20:06
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:Optimizing a deterministic scenario for quality without a tuner timeout - will make SMAC deterministic and only evaluate one configuration per iteration!
INFO:smac.initial_design.sobol_design.SobolDesign:Running initial design for 1 configurations
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:<class 'smac.facade.smac_hpo_facade.SMAC4HPO'>
Tue Jan 17 17:20:06-INFO: Load model and set data loader ...
Tue Jan 17 17:20:07-INFO: Collect quantized variable names ...
Sampling stage, Run batch:|██████████████████████████████████████████████| 10/10
Tue Jan 17 17:22:19-INFO: Update the program ...
Adding quant op with weight:|██████████████████████████████████████████| 320/320
Adding quant activation op:| | 1/688
Tue Jan 17 17:22:39-INFO: The quantized model is saved in quant_model_tmp
ERROR:smac.tae.execute_func.ExecuteTAFuncDict:'NoneType' object is not callable
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 301, in quantize
emd_loss = eval_quant_model()
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 226, in eval_quant_model
out_float = convert_model_out_2_nparr(out_float)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 179, in convert_model_out_2_nparr
out_nparr = np.concatenate(out_list)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 37
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/smac/tae/execute_func.py", line 217, in run
rval = self._call_ta(self._ta, config, obj_kwargs)
File "/usr/local/lib/python3.7/dist-packages/smac/tae/execute_func.py", line 314, in _call_ta
return obj(config, **obj_kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 313, in quantize
feed_target_names, fetch_targets)
TypeError: 'NoneType' object is not callable
2023-01-17 17:22:46,364-INFO: Value for default configuration: 2147483647.00000000
INFO:smac.optimizer.smbo.SMBO:Running initial design
INFO:smac.intensification.intensification.Intensifier:First run, no incumbent provided; challenger is assumed to be the incumbent
Tue Jan 17 17:22:46-INFO: Load model and set data loader ...
Tue Jan 17 17:22:47-INFO: Collect quantized variable names ...
Sampling stage, Run batch:|██████████████████████████████████████████████| 11/11
Tue Jan 17 17:25:03-INFO: Update the program ...
Adding quant op with weight:|██████████████████████████████████████████| 320/320
Adding quant activation op:| | 1/688
Tue Jan 17 17:25:20-INFO: The quantized model is saved in quant_model_tmp
ERROR:smac.tae.execute_func.ExecuteTAFuncDict:'NoneType' object is not callable
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 301, in quantize
emd_loss = eval_quant_model()
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 226, in eval_quant_model
out_float = convert_model_out_2_nparr(out_float)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 179, in convert_model_out_2_nparr
out_nparr = np.concatenate(out_list)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 37
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/smac/tae/execute_func.py", line 217, in run
rval = self._call_ta(self._ta, config, obj_kwargs)
File "/usr/local/lib/python3.7/dist-packages/smac/tae/execute_func.py", line 314, in _call_ta
return obj(config, **obj_kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 313, in quantize
feed_target_names, fetch_targets)
TypeError: 'NoneType' object is not callable
INFO:smac.stats.stats.Stats:---------------------STATISTICS---------------------
INFO:smac.stats.stats.Stats:Incumbent changed: -1
INFO:smac.stats.stats.Stats:Submitted target algorithm runs: 1 / 3.0
INFO:smac.stats.stats.Stats:Finished target algorithm runs: 1 / 3.0
INFO:smac.stats.stats.Stats:Configurations: 1
INFO:smac.stats.stats.Stats:Used wallclock time: 161.08 / inf sec
INFO:smac.stats.stats.Stats:Used target algorithm runtime: 160.96 / inf sec
INFO:smac.stats.stats.Stats:----------------------------------------------------
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:Final Incumbent: None
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
cli.main()
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 322, in run_path
pkg_name=pkg_name, script_name=fname)
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 136, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/root/.vscode-server/extensions/ms-python.python-2022.20.2/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "/paddle/PaddleOCR/act.py", line 57, in <module>
ac.compress()
File "/usr/local/lib/python3.7/dist-packages/paddleslim/auto_compression/compressor.py", line 594, in compress
train_config)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/auto_compression/compressor.py", line 739, in single_strategy_compress
runcount_limit=config.max_quant_count)
File "/usr/local/lib/python3.7/dist-packages/paddleslim/quant/post_quant_hpo.py", line 535, in quant_post_hpo
incumbent = smac.optimize()
File "/usr/local/lib/python3.7/dist-packages/smac/facade/smac_ac_facade.py", line 723, in optimize
incumbent = self.solver.run()
File "/usr/local/lib/python3.7/dist-packages/smac/optimizer/smbo.py", line 307, in run
self._incorporate_run_results(run_info, result, time_left)
File "/usr/local/lib/python3.7/dist-packages/smac/optimizer/smbo.py", line 513, in _incorporate_run_results
"'abort_on_first_run_crash'). Additional run info: %s" % result.additional_info
smac.tae.FirstRunCrashedException: First run crashed, abort. Please check your setup -- we assume that your default configuration does not crashes. (To deactivate this exception, use the SMAC scenario option 'abort_on_first_run_crash'). Additional run info: {}
Also, the readme.md in the reference directory you pointed to is incomplete: it has no Section 3.4.
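> ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 37
Judging from this error message, the failure seems to be caused by model outputs of different sizes. Which model is this, specifically?
You could also try quantization-aware training in auto compression and see whether the same output-related problem still occurs.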
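The following sketch illustrates the diagnosis above under stated assumptions. It reproduces the same kind of `np.concatenate` failure with two hypothetical output arrays whose second dimension is 4 and 37, as in the error message, and shows one way to inspect output shapes before concatenating. The array shapes and the helper names `describe_outputs` and `can_concatenate` are illustrative only and are not part of PaddleSlim. The flattening at the end is just one possible way to combine mismatched outputs, not the library's fix.

```python
import numpy as np


def describe_outputs(outputs):
    """Return the shape of every output array so mismatches are easy to spot."""
    return [tuple(np.asarray(o).shape) for o in outputs]


def can_concatenate(outputs, axis=0):
    """True if all arrays agree on every dimension except `axis` (axis >= 0)."""
    shapes = describe_outputs(outputs)
    if len({len(s) for s in shapes}) != 1:
        return False
    trimmed = {s[:axis] + s[axis + 1:] for s in shapes}
    return len(trimmed) == 1


# Hypothetical outputs of a model with two heads of different channel counts.
head_a = np.zeros((1, 4, 8, 8), dtype="float32")
head_b = np.zeros((1, 37, 8, 8), dtype="float32")

try:
    np.concatenate([head_a, head_b])  # default axis=0; dimension 1 differs
    concat_failed = False
except ValueError:
    concat_failed = True

# Flattening each output first makes concatenation independent of the shapes.
flat = np.concatenate([o.reshape(-1) for o in (head_a, head_b)])
```

Printing `describe_outputs(...)` on the real fetch results would show which outputs differ and along which dimension.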
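> Also, the readme.md in the reference directory you pointed to is incomplete: it has no Section 3.4.
Thanks for pointing out the gap in the documentation.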
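> Judging from this error message, the failure seems to be caused by model outputs of different sizes. Which model is this, specifically?
It is the pgnet model in ppocr.
> You could also try quantization-aware training in auto compression and see whether the same output-related problem still occurs.
I have tried quantization-aware training as well, and it does not work either. #1628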
Environment
Steps to reproduce
Auto-compression code:
How should I troubleshoot this problem further?