MIC-DKFZ / nnDetection

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.
Apache License 2.0

[Question] Visualize predicted bounding boxes #261

Open NataliaAlves13 opened 1 month ago

NataliaAlves13 commented 1 month ago

:question: Question

I want to generate a nifti image from the prediction pkl files in order to visualize the predicted bounding box as an image overlay with the ground truth. How are the points stored in the pred_boxes variable? Is it [z_min, y_min, x_min, z_max, y_max, x_max]? That scheme doesn't fit my results (see example below):

pred_boxes = [[ 74.778915 301.09387  81.22717  314.0541   267.7665   280.9015  ]
              [ 68.93939  304.47952  77.68393  317.9851   217.12233  230.096   ]
              [ 76.44196  303.1566   81.405975 312.67078  270.30823  279.60272 ]
              [185.25406  174.5544   192.86487 187.48819  369.2219   382.15186 ]]
pred_scores = [0.9844008 0.9062509 0.6013834 0.7847769]
original_size_of_raw_data = [208 512 512]
itk_origin = (-249.51171875, -393.01171875, -1369.5)
itk_spacing = (0.9765629768371582, 0.9765629768371582, 3.0)
itk_direction = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)

partha-ghosh commented 1 month ago

Dear @NataliaAlves13,

The format can be confusing because SimpleITK (sitk) reverses the axis order when an image is converted to an array. The two examples below show how the order of the prediction entries follows the array axes:

Example 1:
NIfTI image / sitk image ordering: z, y, x
Array ordering (after GetArrayFromImage): x, y, z
Predictions: x_min, y_min, x_max, y_max, z_min, z_max

Example 2:
NIfTI image / sitk image ordering: x, y, z
Array ordering (after GetArrayFromImage): z, y, x
Predictions: z_min, y_min, z_max, y_max, x_min, x_max

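Note that the prediction entries always follow the array axis order: the first four entries are the minima and maxima along the first two array axes (axis 0 min, axis 1 min, axis 0 max, axis 1 max), and the last two entries are the minimum and maximum along the final array axis. For the data in the question, where the array shape [208 512 512] appears to be in z, y, x order, the boxes therefore read z_min, y_min, z_max, y_max, x_min, x_max.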
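To make the layout concrete, here is a minimal sketch (not code from nnDetection) that turns one box into array slices and, optionally, paints all boxes into a label volume for overlay. The helper names `box_to_slices` and `write_box_overlay` are hypothetical. The sketch assumes that the boxes are already expressed in voxel indices of the original image and that the reference image has the same size as the array.

```python
import math


def box_to_slices(box, shape):
    """Turn one predicted box into a tuple of slices, one per array axis.

    Assumes the layout (a0_min, a1_min, a0_max, a1_max, a2_min, a2_max),
    where a0, a1, a2 are the array axes in order (z, y, x for an array
    obtained with GetArrayFromImage).
    """
    a0_min, a1_min, a0_max, a1_max, a2_min, a2_max = box
    bounds = ((a0_min, a0_max), (a1_min, a1_max), (a2_min, a2_max))
    slices = []
    for (lo, hi), size in zip(bounds, shape):
        # Round outward to whole voxels and clip to the array extent.
        start = max(0, math.floor(lo))
        stop = min(size, math.ceil(hi))
        slices.append(slice(start, stop))
    return tuple(slices)


def write_box_overlay(boxes, shape, reference_path, out_path):
    """Hypothetical helper: give each box its own label and save the volume."""
    # Imported here so that box_to_slices works without these packages.
    import numpy as np
    import SimpleITK as sitk

    mask = np.zeros(shape, dtype=np.uint8)
    for label, box in enumerate(boxes, start=1):
        mask[box_to_slices(box, shape)] = label

    reference = sitk.ReadImage(reference_path)
    overlay = sitk.GetImageFromArray(mask)
    # Copies origin, spacing and direction; requires matching image sizes.
    overlay.CopyInformation(reference)
    sitk.WriteImage(overlay, out_path)


# First and last boxes from the question; array shape is (z, y, x).
shape = (208, 512, 512)
first = [74.778915, 301.09387, 81.22717, 314.0541, 267.7665, 280.9015]
last = [185.25406, 174.5544, 192.86487, 187.48819, 369.2219, 382.15186]
print(box_to_slices(first, shape))
print(box_to_slices(last, shape))
```

Read this way, every box from the question stays inside the 208 x 512 x 512 volume, whereas the [z_min, y_min, x_min, z_max, y_max, x_max] reading would place z_max at about 314, beyond the 208 slices.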

Regarding the format: The x_min, y_min, x_max, y_max format is widely used in object detection frameworks within the natural image computing field. We extended it in this manner to reuse code from the natural image computing domain. This allows us to easily incorporate the third dimension without rewriting entire functions.
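The idea of extending a 2D routine rather than rewriting it can be illustrated with a small sketch. The function `box_size` below is a hypothetical example and not part of nnDetection; it only shows how appending the third axis at the end leaves the first four entries, and the code that handles them, unchanged.

```python
def box_size(box):
    """Area of a 2D box or volume of a 3D box.

    2D layout: (x_min, y_min, x_max, y_max)
    3D layout: (x_min, y_min, x_max, y_max, z_min, z_max)
    """
    x_min, y_min, x_max, y_max = box[:4]
    # The 2D computation is identical for both layouts.
    size = (x_max - x_min) * (y_max - y_min)
    if len(box) == 6:
        # The third axis is appended, so only this step is new.
        z_min, z_max = box[4:]
        size *= z_max - z_min
    return size


area = box_size([0, 0, 2, 3])
volume = box_size([0, 0, 2, 3, 1, 5])
print(area, volume)
```

Had the 3D layout been (x_min, y_min, z_min, x_max, y_max, z_max) instead, the positions of x_max and y_max would differ between 2D and 3D, and the shared 2D step could not be reused as-is.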

github-actions[bot] commented 1 week ago

This issue is stale because it has been open for 30 days with no activity.