Ritchizh closed this issue 2 years ago
Oh, it seems I finally got it. It's all about the binary masks. For those also struggling: 1) Both gt_ids and pred_mask have one entry per point in the point cloud, so index i refers to the same point in both. 2) Each of them is reduced to a binary mask for a single instance. 3) A logical 'and' keeps only the points present in both instances.
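For anyone reading later, here is a minimal sketch of the three steps above with NumPy. The toy arrays, the id values, and the helper names `intersection` and `iou` are made up for illustration and are not taken from the evaluation script; the only thing assumed about the script is that it compares per-point arrays of equal length.

```python
import numpy as np

# Hypothetical toy scene with 8 points. Index i is the same point in every array.
# The id values are arbitrary integers standing in for the packed ground-truth ids.
gt_ids = np.array([3001, 3001, 3001, 3002, 3002, 5001, 5001, 0])

# Binary mask of one predicted instance, same length as gt_ids.
pred_mask = np.array([False, True, True, True, False, False, False, False])


def intersection(gt_ids, gt_instance_id, pred_mask):
    """Count points that belong to both the GT instance and the predicted instance."""
    gt_mask = gt_ids == gt_instance_id          # binary mask of one GT instance
    return int(np.count_nonzero(np.logical_and(gt_mask, pred_mask)))


def iou(gt_ids, gt_instance_id, pred_mask):
    """Point-wise intersection over union of the two binary masks."""
    gt_mask = gt_ids == gt_instance_id
    inter = np.count_nonzero(gt_mask & pred_mask)
    union = np.count_nonzero(gt_mask | pred_mask)
    return inter / union if union else 0.0


print(intersection(gt_ids, 3001, pred_mask))  # overlap with GT instance 3001
print(intersection(gt_ids, 3002, pred_mask))  # overlap with GT instance 3002
print(iou(gt_ids, 3001, pred_mask))
```

No x, y, z coordinates are needed because both masks are indexed by the same points: a point counts toward the intersection exactly when its index is set in both masks, whatever the instance ids are called on each side.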
P.S. I really have a hard time understanding this Cityscapes evaluation script, inherited from 2015 and passed from one 3D segmentation repository to another. Why pack the semantic label and instance id into one variable? Why call points mesh vertices?
I'm glad you got the issue solved, and thanks for contributing back. I believe putting the semantic label and instance id into one variable is just a way to save some space for people submitting their results to an online server, like the competition we are organizing. There must be other benefits for training and testing as well, but obviously not for code readability 😵‍💫. I think many of the evaluation scripts were inherited from another well-known repository (the evaluation script in this repo was modified from HAIS, and HAIS was modified from ScanNet)...
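To make the "one variable" idea concrete, here is a small sketch of how such packing can work. The multiplier of 1000 and the names `ID_MULTIPLIER`, `pack` and `unpack` are assumptions for illustration; check the evaluation script you are using for the actual convention.

```python
# Assumed multiplier: one packed integer holds both values as
# semantic_label * ID_MULTIPLIER + instance_id. Hypothetical, not read from the script.
ID_MULTIPLIER = 1000


def pack(semantic_label, instance_id):
    """Combine a semantic label and a per-class instance id into one integer."""
    if not 0 <= instance_id < ID_MULTIPLIER:
        raise ValueError("instance_id must fit below the multiplier")
    return semantic_label * ID_MULTIPLIER + instance_id


def unpack(packed_id):
    """Recover (semantic_label, instance_id) from a packed integer."""
    return divmod(packed_id, ID_MULTIPLIER)


packed = pack(3, 12)
print(packed)          # 3012
print(unpack(packed))  # (3, 12)
```

With this layout a submission needs only one integer per point, and the semantic label can still be recovered with an integer division.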
Yep, and ScanNet's script was modified from Cityscapes: https://github.com/ScanNet/ScanNet/blob/2c2f8003e6f4eb122dc96bcb2e072f9813fc73ab/BenchmarkScripts/3d_evaluation/evaluate_semantic_instance.py#L2 So I guess we can establish this to be the source of the ancient evil 😅
Hi! I'm looking into STPLS3DInstanceSegmentationChallenge_Codalab_Evaluate.py. Could you please tell me how the intersection is calculated for two instances?
https://github.com/meidachen/STPLS3D/blob/65917491c6a507b97c1d1ed60dcffd418524e3d8/HAIS/STPLS3DInstanceSegmentationChallenge_Codalab_Evaluate.py#L279 Here gt_ids is the ground-truth [semantic_label + instance_id] for all points; gt_inst['instance_id'] is the [semantic_label + instance_id] value of a ground-truth instance with the selected semantic label; pred_mask is a binary mask of a single predicted instance.
Do we calculate the number of points that have the same semantic label inside two instances? How do we know they actually intersect, if a single object can have different instance ids in the ground truth and the prediction, and we don't look at x, y, z as is usually done with bounding boxes in object detection tasks?