V3Det / Detectron2-V3Det

Detectron2 Toolbox and Benchmark for V3Det
Apache License 2.0

evaluation gets stuck #4

Closed: twangnh closed this issue 3 months ago

twangnh commented 3 months ago

Hi @yhcao6, we are using Detectron2-V3Det and running evaluation with

python tools/train_detic.py --config-file projects/Detic/configs/ovd/BoxSup-C2_V3Det-OVD-Base_CLIP_R5021k_640b64_4x.yaml --num-gpus 1 --eval-only MODEL.WEIGHTS output/Detic/BoxSup-C2_V3Det-OVD-Base_CLIP_R5021k_640b64_4x/model_final.pth

However, the evaluation gets stuck at the final phase; even when using only 100 images for evaluation, it remains the same:

return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[04/12 16:08:15 detic.evaluation.evaluator]: Inference done 11/100. Dataloading: 0.0009 s/iter. Inference: 0.0507 s/iter. Eval: 0.0004 s/iter. Total: 0.0519 s/iter. ETA=0:00:04
[04/12 16:08:18 detic.evaluation.evaluator]: Total inference time: 0:00:03.980425 (0.041899 s / iter per device, on 1 devices)
[04/12 16:08:18 detic.evaluation.evaluator]: Total inference pure compute time: 0:00:03 (0.039781 s / iter per device, on 1 devices)
[04/12 16:08:18 detic.evaluation.coco_evaluation]: Preparing results for COCO format ...
[04/12 16:08:18 detic.evaluation.coco_evaluation]: Saving results to ./output/Detic/BoxSup-C2_V3Det-OVD-Base_CLIP_R5021k_640b64_4x/inference_v3det_val/coco_instances_results.json
[04/12 16:08:19 detic.evaluation.coco_evaluation]: Evaluating predictions with official COCO API...
Loading and preparing results...
DONE (t=0.38s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
04/12 16:08:19 - mmengine - INFO - start multi processing evaluation with nproc: 8...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 196818600/196818600 [07:34<00:00, 432956.50it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 196818600/196818600 [07:34<00:00, 433140.05it/s]

yhcao6 commented 3 months ago

Could you please try this script: https://github.com/V3Det/V3Det/blob/main/evaluation/eval_v3det.py and change num_proc=8 to num_proc=1 on this line: https://github.com/V3Det/V3Det/blob/d8f557fdea6b9bd52365a012aa2e94c10a118a8f/evaluation/eval_v3det.py#L17 Then run:

python eval_v3det.py ./output/Detic/BoxSup-C2_V3Det-OVD-Base_CLIP_R5021k_640b64_4x/inference_v3det_val/coco_instances_results.json
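For intuition, the suggestion above boils down to the usual debugging move of swapping a multiprocessing pool for a plain loop. A minimal stdlib sketch of that switch follows; all names here (eval_pair, run_eval) are hypothetical stand-ins, not the real API of eval_v3det.py:

```python
# Minimal sketch of the single- vs multi-process switch that a
# num_proc-style setting typically controls. Hypothetical names;
# the real eval_v3det.py does its per-(image, category) matching here.
from multiprocessing import Pool


def eval_pair(pair):
    """Stand-in for the per-(image, category) matching work."""
    img_id, cat_id = pair
    return (img_id, cat_id, img_id * cat_id % 7)  # dummy "score"


def run_eval(pairs, num_proc=1):
    if num_proc > 1:
        # Parallel path: faster, but a dead worker or result backlog
        # can make the whole evaluation appear to hang.
        with Pool(num_proc) as pool:
            return pool.map(eval_pair, pairs)
    # num_proc == 1: a plain loop, easiest to debug when evaluation stalls.
    return [eval_pair(p) for p in pairs]


if __name__ == "__main__":
    pairs = [(i, c) for i in range(4) for c in range(3)]
    print(run_eval(pairs, num_proc=1))
```

With num_proc=1 any exception surfaces directly in the main process instead of being swallowed by a worker, which is why it is the first thing to try when a parallel evaluation hangs.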

twangnh commented 3 months ago

I tried the script to directly evaluate the given json file, but it still got stuck:

loading annotations into memory...
Done (t=0.91s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.34s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
04/12 19:50:36 - mmengine - INFO - start multi processing evaluation with nproc: 1...
88%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 1383927539/1575025936 [55:43<06:44, 472052.43it/s]
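Worth noting: the progress bar is not actually frozen, just enormous. Dividing the remaining iterations by the reported rate reproduces the bar's own ETA (all numbers taken from the log above):

```python
# Sanity-check the tqdm ETA from the log above: remaining work / rate.
total = 1_575_025_936   # total iterations reported by the bar
done = 1_383_927_539    # iterations completed at the 88% mark
rate = 472_052.43       # it/s reported by the bar

remaining_s = (total - done) / rate
mins, secs = divmod(int(remaining_s), 60)
print(f"{mins:02d}:{secs:02d}")  # -> 06:44, matching the bar's ETA
```

So the run would finish, but the sheer iteration count (roughly images × categories, and V3Det has ~13k categories) makes it look stuck.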

yhcao6 commented 3 months ago

It is strange; the evaluation program runs fine on our server. To pin down the exact error, I would recommend removing the multi-processing. To do this, you can replace these lines https://github.com/V3Det/V3Det/blob/d8f557fdea6b9bd52365a012aa2e94c10a118a8f/evaluation/cocoeval_mp.py#L164-L179 with

self.evalImgs = self.evalImgs(catIds)[0]

Please try with a small number of images first to save time.
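As a toy illustration of what dropping the multi-processing amounts to: the evaluation becomes one sequential nested loop over categories, area ranges, and images. All names below (evaluate_img, the area ranges) are illustrative, not the real cocoeval_mp.py code:

```python
# Toy sketch of a sequential, easy-to-debug evaluation loop of the kind
# the multi-processing in cocoeval_mp.py parallelizes. Hypothetical names.
def evaluate_img(img_id, cat_id, area_rng):
    """Stand-in for a COCOeval-style evaluateImg: one per-pair record."""
    return {"image_id": img_id, "category_id": cat_id, "area_rng": area_rng}


def evaluate_sequential(img_ids, cat_ids, area_rngs):
    # Cost is len(cat_ids) * len(area_rngs) * len(img_ids) calls, which
    # is why V3Det's ~13k categories make this loop so long: any crash
    # or hang now happens in the main process, where it is visible.
    return [
        evaluate_img(i, c, a)
        for c in cat_ids
        for a in area_rngs
        for i in img_ids
    ]


eval_imgs = evaluate_sequential(img_ids=[1, 2], cat_ids=[10, 20],
                                area_rngs=["all"])
```

With only a handful of images, as suggested above, this sequential form finishes quickly and surfaces any real error directly.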