open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0
29.32k stars, 9.42k forks

what's the meaning of data_time and time? #10795

Open lb-hit opened 1 year ago

lb-hit commented 1 year ago

Hi, when I test the pretrained model with test.py, the data_time and time values at the end of the log confuse me. Does data_time represent the preprocessing time of one image? Does time represent the total time for one image? Is time - data_time the inference time of one image? If not, how should I get the inference time for a single image? Are these values in seconds or hours? Could you please explain what they mean?

hhaAndroid commented 1 year ago

time is the total time of one iteration, and data_time is the dataloader time of one iteration: time = data_time + model time.

If you want to measure the inference time, please use https://github.com/open-mmlab/mmdetection/blob/main/.dev_scripts/benchmark_inference_fps.py

Both values are in seconds.
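As a rough illustration of the relationship time = data_time + model time, here is a minimal sketch of how a per-iteration timer could separate the two. The names `timed_loop` and `model_time` are hypothetical and are not taken from MMDetection or MMEngine; the sketch also assumes CPU-only work, since GPU code would need a device synchronization before each clock read.

```python
import time
from typing import Callable, Iterable, Tuple


def timed_loop(batches: Iterable, step: Callable) -> Tuple[float, float]:
    """Return (mean time, mean data_time) in seconds per iteration.

    Hypothetical helper for illustration only.
    """
    total = 0.0       # accumulated full-iteration time
    data_total = 0.0  # accumulated time spent waiting for a batch
    count = 0
    iterator = iter(batches)
    while True:
        start = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here counts as data_time
        except StopIteration:
            break
        fetched = time.perf_counter()
        step(batch)                 # model forward pass and anything after it
        end = time.perf_counter()
        data_total += fetched - start
        total += end - start
        count += 1
    if count == 0:
        raise ValueError("no iterations were run")
    return total / count, data_total / count


def model_time(iter_time: float, data_time: float) -> float:
    """Model time per iteration, derived from the two logged values."""
    return iter_time - data_time
```

With logged values of, say, time = 0.25 and data_time = 0.05, `model_time(0.25, 0.05)` gives roughly 0.20 seconds per iteration. Note that this is per iteration, so it covers a whole batch rather than a single image when the batch size is larger than 1.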

lb-hit commented 1 year ago

Hi, I ran into another error: cannot import name 'repeat_measure_inference_speed' from 'tools.analysis_tools.benchmark'. benchmark.py (main or 3.x version) does not define repeat_measure_inference_speed; I could only find that function in mmdet 2.x. When I copied the function into benchmark.py (main or 3.x version), I hit more errors caused by differences between the versions.

So I gave up on that approach and ran the following command in a terminal instead:

python -m torch.distributed.launch --nproc_per_node=1 --master_port=29500 tools/analysis_tools/benchmark.py $cofig $checkpoint --launcher pytorch

It produced the following error:

NOTE: Redirects are currently not supported in Windows or MacOs.
C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun. If your script expects --local-rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions
  warnings.warn(
[W C:\cb\pytorch_1000000000000\work\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [DESKTOP-ES88PVG]:29500 (system error: 10049 - The requested address is not valid in its context.).
[W C:\cb\pytorch_1000000000000\work\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [DESKTOP-ES88PVG]:29500 (system error: 10049 - The requested address is not valid in its context.).
usage: benchmark.py [-h] [--checkpoint CHECKPOINT] [--task {inference,dataloader,dataset}] [--repeat-num REPEAT_NUM] [--max-iter MAX_ITER] [--log-interval LOG_INTERVAL] [--num-warmup NUM_WARMUP] [--fuse-conv-bn] [--dataset-type {train,val,test}] [--work-dir WORK_DIR] [--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]] [--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK] config
benchmark.py: error: unrecognized arguments: --local-rank=0 work_dirs/#O1_1yolox_s_8xb8-300e_coco_FS-0685/best_coco_bbox_mAP_epoch_18.pth
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 16576) of binary: C:\ProgramData\Anaconda3\envs\mmdet\python.exe
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launch.py", line 196, in
    main()
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launch.py", line 192, in main
    launch(args)
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launch.py", line 177, in launch
    run(args)
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\run.py", line 785, in run
    elastic_launch(
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launcher\api.py", line 134, in call
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\Anaconda3\envs\mmdet\Lib\site-packages\torch\distributed\launcher\api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
tools/analysis_tools/benchmark.py FAILED
Failures:

Root Cause (first observed failure):
[0]:
  time : 2023-08-16_15:24:32
  host : DESKTOP-ES88PVG
  rank : 0 (local_rank: 0)
  exitcode : 2 (pid: 16576)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

hhaAndroid commented 1 year ago

@sibet-lb It appears that you have made an error in the input parameters.

lb-hit commented 1 year ago

@hhaAndroid I'm sorry, I don't quite understand the error you're talking about, can you expand on it? Or how should I modify it?
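
For reference, the usage line and the "unrecognized arguments" message in the log above suggest two mismatches: the checkpoint was passed as a second positional argument although the script only accepts it via --checkpoint, and the launcher passed --local-rank=0 while the script declares --local_rank. The sketch below reproduces this with a cut-down parser. It is an assumption based only on the usage string in the log, not the actual benchmark.py code, and `build_parser`, `leftover_args`, `cfg.py`, and `ckpt.pth` are hypothetical names.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Cut-down parser mirroring part of the usage string in the log (assumed)."""
    parser = argparse.ArgumentParser(prog="benchmark.py")
    parser.add_argument("config")        # the only positional argument
    parser.add_argument("--checkpoint")  # checkpoint is an option, not positional
    parser.add_argument(
        "--launcher",
        choices=["none", "pytorch", "slurm", "mpi"],
        default="none",
    )
    parser.add_argument("--local_rank", type=int, default=0)  # underscore spelling
    return parser


def leftover_args(argv: List[str]) -> List[str]:
    """Return the arguments the parser does not recognize."""
    _, unknown = build_parser().parse_known_args(argv)
    return unknown


# Same shape as the failing call: positional checkpoint, hyphenated --local-rank.
failing = ["cfg.py", "ckpt.pth", "--launcher", "pytorch", "--local-rank=0"]

# Checkpoint passed through --checkpoint, rank passed with the underscore spelling.
accepted = ["cfg.py", "--checkpoint", "ckpt.pth", "--launcher", "pytorch", "--local_rank=0"]

print(leftover_args(failing))
print(leftover_args(accepted))
```

If this reading is right, passing the checkpoint as `--checkpoint $checkpoint` should resolve the first mismatch. For the second, the usage line lists `none` as a launcher choice, so running the script directly with plain `python` and without `torch.distributed.launch` may avoid the --local-rank flag altogether.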