location of inference results (bbox coordinates) - where?

romanon5 commented 2 years ago

I am running inference (runmode=saved_model_infer, with the d7x version) on my own images, which are located in the "testdata" dir. The inference succeeds, and the annotated images are written to the "serve_image_out" dir.

But I can't find the text (or other) file that contains the detections' coordinates, which I need to compare the results against my ground truth. Where is it located?

daunnn commented 2 years ago

I'm thinking about the same problem. Did you solve it?

romanon5 commented 2 years ago

Yes, I believe I did. I don't think there is an output file with this info, or at least I couldn't find one. But I found what seems to be the location in the code where the final bounding boxes are drawn on the output images, and I added some code there to create a text file with the inference's bbox output data.

It seems that the final results are available in vis_utils.py, in the function draw_bounding_box_on_image. At the end of that function, the needed info is in these variables: display_str (predicted class and confidence %) and left, right, top, bottom (bbox coordinates). These values are produced in a for loop, once for every detected object in the image.

(full functions path: model_inspect.py: run_model -> saved_model_inference -> driver.visualize, which is inside inference.py, then: visualize_image_prediction -> visualize_image -> vis_utils.visualize_boxes_and_labels_on_image_array, which is inside vis_utils.py, then: draw_bounding_box_on_image_array -> draw_bounding_box_on_image)

----------- EXTRA -----------

I added code that creates a text file in model_inspect.py, at line 171:

import time
ts = time.time()
opfilepath = output_dir + '_' + str(ts) + '.txt'
print('OPENING FILE')
output_file = open(opfilepath, 'a')
output_file.write('img_id str xmin xmax ymin ymax\n')  # headers

and code that closes it after the end of the for loop that starts at line 178:

print('CLOSING FILE')
output_file.close()
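
A minimal sketch of the same open/write/close pattern using a with-block, so the file is closed even if the loop raises. The function name write_detections and its rows argument are hypothetical and only stand in for the loop in model_inspect.py; they are not part of the automl code.

```python
import os
import tempfile
import time


def write_detections(output_dir, rows):
    """Write the header plus one line per detection; return the file path.

    `rows` is a hypothetical list of already-formatted strings. In the thread,
    the lines are written from inside draw_bounding_box_on_image instead.
    """
    opfilepath = output_dir + '_' + str(time.time()) + '.txt'
    # The with-block closes the file even if an exception interrupts the loop.
    with open(opfilepath, 'a') as output_file:
        output_file.write('img_id str xmin xmax ymin ymax\n')  # headers
        for row in rows:
            output_file.write(row + '\n')
    return opfilepath


if __name__ == '__main__':
    out_prefix = os.path.join(tempfile.mkdtemp(), 'serve_image_out')
    path = write_detections(out_prefix, ['0 person 10 50 20 80'])
    with open(path) as f:
        print(f.read())
```

With this shape, the separate output_file.close() call is no longer needed.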

Then I added two arguments to all function definitions and uses in the full functions path mentioned earlier - img_id, to know with which frame a detection is associated, and output_file, to write the detection info.

As for the writing itself, I added it at the end of the draw_bounding_box_on_image function (the final function in the functions path), around line 241 in vis_utils.py. It looks like this:

line_of_output = ' '.join([str(img_id),
                           str(display_str),
                           str(round(left)),
                           str(round(right)),
                           str(round(top)),
                           str(round(bottom))])
line_of_output = line_of_output + '\n'
output_file.write(line_of_output)

(Note that the predicted class string in display_str can contain a space ('fire hydrant', for example), so it may be better to join the fields in each line with a special character or string instead of a space.)
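
A minimal sketch of the separator idea in the note above, assuming a tab is an acceptable delimiter and using Python's standard csv module. The helper name write_detection is hypothetical; the column order follows the header line used earlier in this thread.

```python
import csv
import io


def write_detection(writer, img_id, display_str, left, right, top, bottom):
    """Write one detection as a tab-separated row.

    Column order: img_id, str, xmin, xmax, ymin, ymax.
    """
    writer.writerow([img_id, display_str,
                     round(left), round(right), round(top), round(bottom)])


if __name__ == '__main__':
    buffer = io.StringIO()  # stands in for output_file
    writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
    writer.writerow(['img_id', 'str', 'xmin', 'xmax', 'ymin', 'ymax'])
    write_detection(writer, 3, 'fire hydrant: 87%', 10.4, 99.6, 20.2, 80.7)
    # Reading the data back: the class name keeps its internal space.
    rows = list(csv.reader(io.StringIO(buffer.getvalue()), delimiter='\t'))
    print(rows[1])
```

Reading the file back with csv.reader and the same delimiter then yields six fields per line, even for multi-word class names.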

daunnn commented 2 years ago

Thank you for your help. Previously I had a problem where the image id could not be saved continuously; now I get an error saying that image_id could not be found in the draw_bounding_box_on_image function. How did you fix this error?

daunnn commented 2 years ago

Can you also explain the part where you add the two arguments (img_id and output_file)?

romanon5 commented 2 years ago

Hi, I currently don't have time to elaborate. I will try to edit this comment later with a more thorough explanation.

In short, to get img_id (and output_file) from the start to the final function, they need (to my understanding, and in my implementation) to be passed through all the functions in the functions path. In the function definitions, for example the visualize_image function (in automl/efficientdet/inference.py), I added the two to the parameter list like this:

*The triple asterisks around the arguments are only there to emphasize the added ones; they are not part of the code.

def visualize_image(***output_file***, ***img_id***, image,
                    boxes,
                    classes,
                    scores,
                    label_map=None,
                    min_score_thresh=0.01,
                    max_boxes_to_draw=1000,
                    line_thickness=2,
                    **kwargs):
  """Visualizes a given image.
.
.
.

and in the call to the function (in the visualize_image_prediction function, for this example), the two arguments are passed, also at the start of the argument list and in the same order:

return visualize_image(***output_file***, ***img_id***, image, boxes, classes, scores, label_map, **kwargs)

How did these values reach visualize_image_prediction so they could be passed on to visualize_image? From the function that calls visualize_image_prediction itself (I think it is the driver.visualize function in inference.py).
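
A toy sketch of this argument threading, with heavily simplified stand-ins for the three functions. The signatures, the (left, right, top, bottom) box layout, and the prediction tuple are assumptions made for illustration and do not match the real automl code. It only shows that every function in the chain has to accept and forward output_file and img_id; if one link omits them, the innermost function cannot see them.

```python
import io


def draw_bounding_box_on_image(output_file, img_id, display_str,
                               left, right, top, bottom):
    # Innermost function: the only place the two extra arguments are used.
    fields = [img_id, display_str,
              round(left), round(right), round(top), round(bottom)]
    output_file.write('\t'.join(str(f) for f in fields) + '\n')


def visualize_image(output_file, img_id, boxes, classes):
    # Middle function: accepts the two arguments only to forward them.
    for (left, right, top, bottom), name in zip(boxes, classes):
        draw_bounding_box_on_image(output_file, img_id, name,
                                   left, right, top, bottom)


def visualize_image_prediction(output_file, img_id, prediction):
    # Outer function: unpacks a (hypothetical) prediction and forwards.
    boxes, classes = prediction
    return visualize_image(output_file, img_id, boxes, classes)


if __name__ == '__main__':
    out = io.StringIO()  # stands in for the opened text file
    predictions = [([(1, 5, 2, 6)], ['person']),
                   ([(0, 9, 3, 7)], ['fire hydrant'])]
    # The top-level loop owns img_id and output_file and starts the chain.
    for img_id, prediction in enumerate(predictions):
        visualize_image_prediction(out, img_id, prediction)
    print(out.getvalue())
```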

google / automl

location of inference results (bbox coordinates) - where? #1130