junjiehe96 / FastInst

[CVPR2023] FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
MIT License
Cannot reproduce your frame rate on a V100 #38

Open jiajia131 opened 6 months ago

jiajia131 commented 6 months ago

Originally posted by @junjiehe96 in https://github.com/junjiehe96/FastInst/issues/37#issuecomment-2028095580

        +---------------------------------------------------------------------------------------+
        | NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
        |-----------------------------------------+----------------------+----------------------+
        | GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |                                         |                      |               MIG M. |
        |=========================================+======================+======================|
        |   0  Tesla V100-PCIE-32GB            On | 00000000:07:00.0 Off |                    0 |
        | N/A   31C    P0               37W / 250W|   5146MiB / 32768MiB |      4%      Default |
        |                                         |                      |                  N/A |
        +-----------------------------------------+----------------------+----------------------+

[04/06 17:46:17 d2.evaluation.evaluator]: Inference done 370/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1694 s/iter. Total: 0.2281 s/iter. ETA=0:17:36
[04/06 17:46:22 d2.evaluation.evaluator]: Inference done 391/5000. Dataloading: 0.0022 s/iter. Inference: 0.0567 s/iter. Eval: 0.1699 s/iter. Total: 0.2289 s/iter. ETA=0:17:35
[04/06 17:46:27 d2.evaluation.evaluator]: Inference done 412/5000. Dataloading: 0.0022 s/iter. Inference: 0.0568 s/iter. Eval: 0.1704 s/iter. Total: 0.2296 s/iter. ETA=0:17:33
[04/06 17:46:32 d2.evaluation.evaluator]: Inference done 433/5000. Dataloading: 0.0023 s/iter. Inference: 0.0567 s/iter. Eval: 0.1712 s/iter. Total: 0.2303 s/iter. ETA=0:17:31
[04/06 17:46:37 d2.evaluation.evaluator]: Inference done 457/5000. Dataloading: 0.0022 s/iter. Inference: 0.0565 s/iter. Eval: 0.1706 s/iter. Total: 0.2295 s/iter. ETA=0:17:22
[04/06 17:46:42 d2.evaluation.evaluator]: Inference done 480/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16

May I ask why I cannot reproduce your frame rate on a V100? Using the default settings of fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml unchanged, I get fewer than 20 FPS.

jiajia131 commented 6 months ago

[04/07 04:39:41 d2.evaluation.evaluator]: Total inference time: 0:16:15.725855 (0.195341 s / iter per device, on 1 devices)
[04/07 04:39:41 d2.evaluation.evaluator]: Total inference pure compute time: 0:04:38 (0.055843 s / iter per device, on 1 devices)
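
For reference, the per-iteration times in these summary lines convert to frame rates as 1 / (s/iter). The snippet below is a minimal sketch and not part of FastInst or detectron2: the regular expression and the helper name `fps_from_summary` are assumptions for illustration, written against the log format quoted in this thread only.

```python
import re

# Assumed pattern for the "(<seconds> s / iter per device" part of the evaluator
# summary lines quoted in this thread; other log formats may not match.
SUMMARY_RE = re.compile(
    r"Total inference (pure compute )?time: .*?\(([\d.]+) s / iter per device"
)


def fps_from_summary(log_text):
    """Return frames per second for each summary line found in log_text.

    Keys are "total" and "pure_compute" (hypothetical names chosen here).
    """
    result = {}
    for match in SUMMARY_RE.finditer(log_text):
        key = "pure_compute" if match.group(1) else "total"
        result[key] = 1.0 / float(match.group(2))
    return result


log = (
    "[04/07 04:39:41 d2.evaluation.evaluator]: Total inference time: "
    "0:16:15.725855 (0.195341 s / iter per device, on 1 devices)\n"
    "[04/07 04:39:41 d2.evaluation.evaluator]: Total inference pure compute "
    "time: 0:04:38 (0.055843 s / iter per device, on 1 devices)\n"
)

for name, fps in fps_from_summary(log).items():
    print(f"{name}: {fps:.1f} FPS")
```

For the two lines above this gives about 5.1 FPS for the total time and about 17.9 FPS for pure compute, which matches the "fewer than 20 FPS" reported. Judging from the progress lines earlier in the thread, the total figure also covers data loading and COCO evaluation, so the pure compute figure is the closer one to a model frame rate.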

junjiehe96 commented 6 months ago

Could you provide your detailed test script and log?

jiajia131 commented 6 months ago

log.txt

python train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ../fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

Thank you. This run used a Tesla V100-SXM2-16GB.

junjiehe96 commented 6 months ago

Could you test again with the following command?

CUDA_VISIBLE_DEVICES=0 python tools/analyze_model.py --tasks speed \
  --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml

jiajia131 commented 6 months ago

CUDA_VISIBLE_DEVICES=0 python tools/analyze_model.py --tasks speed --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ./fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

It is even slower with this command:

[04/07 06:20:28 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /hy-tmp/FastInst-main-240320/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth ...
[04/07 06:20:28 fvcore.common.checkpoint]: [Checkpointer] Loading from /hy-tmp/FastInst-main-240320/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth ...
[04/07 06:20:28 d2.evaluation.evaluator]: Start inference on 5000 batches
[04/07 06:20:31 d2.evaluation.evaluator]: Inference done 11/5000. Dataloading: 0.0149 s/iter. Inference: 0.1241 s/iter. Eval: 0.0000 s/iter. Total: 0.1391 s/iter. ETA=0:11:33
[04/07 06:20:36 d2.evaluation.evaluator]: Inference done 47/5000. Dataloading: 0.0247 s/iter. Inference: 0.1146 s/iter. Eval: 0.0000 s/iter. Total: 0.1394 s/iter. ETA=0:11:30
[04/07 06:20:41 d2.evaluation.evaluator]: Inference done 84/5000. Dataloading: 0.0277 s/iter. Inference: 0.1118 s/iter. Eval: 0.0000 s/iter. Total: 0.1396 s/iter. ETA=0:11:26
[04/07 06:20:46 d2.evaluation.evaluator]: Inference done 120/5000. Dataloading: 0.0276 s/iter. Inference: 0.1127 s/iter. Eval: 0.0000 s/iter. Total: 0.1404 s/iter. ETA=0:11:24
[04/07 06:20:51 d2.evaluation.evaluator]: Inference done 155/5000. Dataloading: 0.0293 s/iter. Inference: 0.1122 s/iter. Eval: 0.0000 s/iter. Total: 0.1416 s/iter. ETA=0:11:25
[04/07 06:20:56 d2.evaluation.evaluator]: Inference done 191/5000. Dataloading: 0.0298 s/iter. Inference: 0.1117 s/iter. Eval: 0.0000 s/iter. Total: 0.1416 s/iter. ETA=0:11:21
[04/07 06:21:01 d2.evaluation.evaluator]: Inference done 226/5000. Dataloading: 0.0296 s/iter. Inference: 0.1122 s/iter. Eval: 0.0000 s/iter. Total: 0.1419 s/iter. ETA=0:11:17
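
To compare the two runs stage by stage, the progress lines can be split into their data loading, inference, and evaluation components. The following is a rough sketch under the assumption that every progress line has the format quoted in this thread; `PROGRESS_RE` and `last_breakdown` are hypothetical names introduced here, not detectron2 APIs.

```python
import re

# Assumed pattern for the per-stage timings in the progress lines quoted above.
PROGRESS_RE = re.compile(
    r"Dataloading: ([\d.]+) s/iter\. Inference: ([\d.]+) s/iter\. "
    r"Eval: ([\d.]+) s/iter\. Total: ([\d.]+) s/iter\."
)


def last_breakdown(log_text):
    """Return the stage timings (seconds per iteration) from the last progress line."""
    matches = PROGRESS_RE.findall(log_text)
    if not matches:
        raise ValueError("no progress lines found")
    dataloading, inference, evaluation, total = map(float, matches[-1])
    return {
        "dataloading": dataloading,
        "inference": inference,
        "eval": evaluation,
        "total": total,
    }


# Last progress line of each run, copied from the logs in this thread.
runs = {
    "train_net.py --eval-only": (
        "Inference done 480/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. "
        "Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16"
    ),
    "analyze_model.py --tasks speed": (
        "Inference done 226/5000. Dataloading: 0.0296 s/iter. Inference: 0.1122 s/iter. "
        "Eval: 0.0000 s/iter. Total: 0.1419 s/iter. ETA=0:11:17"
    ),
}

for name, line in runs.items():
    stages = last_breakdown(line)
    share = stages["inference"] / stages["total"]
    print(f"{name}: inference {1.0 / stages['inference']:.1f} FPS, "
          f"{share:.0%} of total time")
```

On these lines the inference stage alone corresponds to roughly 17.7 FPS for the `train_net.py` run, where evaluation takes most of the total time, and roughly 8.9 FPS for the `analyze_model.py` run, where the evaluation stage is reported as zero.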

junjiehe96 commented 6 months ago

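This looks a bit odd. That said, frame rate depends on the specific hardware and software environment the model runs in. You could benchmark other methods on the same machine to get a fair comparison.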
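
One way to set up such a same-machine comparison is to time every method with the same harness, the same warm-up, and the same number of iterations. The sketch below uses only the Python standard library and is not the FastInst benchmarking code; `benchmark`, `compare`, and the placeholder workloads are hypothetical. For a CUDA model, one would typically pass `torch.cuda.synchronize` as `sync` so that queued GPU work is included in each measurement.

```python
import statistics
import time


def benchmark(fn, warmup=10, iters=50, sync=None):
    """Time fn() after a warm-up phase and return latency statistics and FPS.

    sync is an optional callable invoked before the clock is read, so that
    asynchronous work started by fn() is counted in the measurement.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        latencies.append(time.perf_counter() - start)
    mean = statistics.mean(latencies)
    return {
        "mean_s": mean,
        "median_s": statistics.median(latencies),
        "fps": 1.0 / mean if mean > 0 else float("inf"),
    }


def compare(methods, **kwargs):
    """Run benchmark() with identical settings for every named callable."""
    return {name: benchmark(fn, **kwargs) for name, fn in methods.items()}


# Placeholder CPU workloads standing in for the models under comparison.
methods = {
    "method_a": lambda: sum(range(20_000)),
    "method_b": lambda: sum(range(40_000)),
}

for name, stats in compare(methods, warmup=3, iters=20).items():
    print(f"{name}: {stats['mean_s'] * 1e3:.3f} ms/iter, {stats['fps']:.1f} FPS")
```

Because absolute numbers vary with the GPU, driver, and library versions, the ratio between methods measured this way is usually more informative than any single FPS value.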