Build a simple algorithm to assess the performance of our automated image segmentation algorithm by comparing it against a manually performed segmentation of the same image. The algorithm will take as input two sets of coordinates, one produced computationally and the other manually, and produce as output an assessment of the computational segmentation against the manual ground truth. At minimum, this should include standard performance metrics like precision, recall, and F-score, and ideally it might also produce more specific, human-readable comments (e.g. "failed to segment marginal notation from main entry of body", "failed to detect vertical break between consecutive entries").
take as input two lists of arbitrary length, representing user-created and algorithmically-identified rectangles, respectively
each element in each list is itself a list with four elements, representing the top, left, bottom, and right coordinates of a rectangular image region
for each algorithmically-identified rectangle, determine whether or not it "matches" one created by a user
make this determination by calculating the intersection over union (IoU) for each possible algorithm-user rectangle pair and comparing the IoU to a threshold (default value of 0.5, but configurable when calling the function)
finally, produce precision, recall, and F-score measurements based on the counts of true positive and false positive algorithmically-identified rectangles and false negative user-created rectangles, as dictated by the "matching" logic above (true negatives are not well defined for a detection task like this, and none of these metrics requires them); see the sketch after this list
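A minimal sketch in Python of one way to satisfy the spec above. The function names (`iou`, `evaluate_segmentation`), the greedy one-to-one matching strategy, and the returned dictionary shape are assumptions for illustration, not part of the requirements:

```python
def iou(box_a, box_b):
    """Intersection over union of two [top, left, bottom, right] rectangles."""
    top = max(box_a[0], box_b[0])
    left = max(box_a[1], box_b[1])
    bottom = min(box_a[2], box_b[2])
    right = min(box_a[3], box_b[3])
    if bottom <= top or right <= left:
        return 0.0  # no overlap at all
    intersection = (bottom - top) * (right - left)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)

def evaluate_segmentation(user_boxes, algo_boxes, iou_threshold=0.5):
    """Match algorithm boxes to user boxes by IoU; report precision/recall/F-score."""
    matched_user = set()
    true_positives = 0
    # Greedy matching (an assumption): pair each algorithm box with the
    # highest-IoU user box that has not already been claimed.
    for algo_box in algo_boxes:
        best_iou, best_idx = 0.0, None
        for idx, user_box in enumerate(user_boxes):
            if idx in matched_user:
                continue
            score = iou(algo_box, user_box)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= iou_threshold:
            true_positives += 1
            matched_user.add(best_idx)
    false_positives = len(algo_boxes) - true_positives  # algo boxes with no match
    false_negatives = len(user_boxes) - true_positives  # user boxes never matched
    precision = true_positives / len(algo_boxes) if algo_boxes else 0.0
    recall = true_positives / len(user_boxes) if user_boxes else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f_score": f_score}
```

A hypothetical usage example: the first algorithm box below overlaps its user box well enough to count as a match, while the second covers too little of its counterpart, yielding one true positive, one false positive, and one false negative:

```python
user = [[0, 0, 100, 50], [0, 60, 100, 120]]
algo = [[2, 1, 98, 52], [70, 60, 100, 120]]
print(evaluate_segmentation(user, algo))
# {'precision': 0.5, 'recall': 0.5, 'f_score': 0.5}
```

Greedy matching is the simplest choice here; an optimal one-to-one assignment (e.g. Hungarian matching) could recover slightly more true positives in crowded layouts, at the cost of extra complexity.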