Rescale localizations for Sharktopoda 2

kevinsbarnard commented 1 year ago

Currently, the bounding box dimensions are used as-is when fed into Sharktopoda 2. However, the target video may have a different resolution than the source video/image, so we need to rescale the bounding box accordingly.

kevinsbarnard commented 1 year ago

This also applies to framegrabs of a different resolution. We also cannot assume the difference in dimensions is purely a matter of scale; the box transform can include a constant offset. For example, with letterboxed framegrabs, the bounding box position needs to be offset by the letterbox size before the scaling is applied.

In other words, we're currently doing:

\begin{align}
x_{\text{target box}} &= s_x x_{\text{source box}} \\
y_{\text{target box}} &= s_y y_{\text{source box}} \\ 
w_{\text{target box}} &= s_x w_{\text{source box}} \\
h_{\text{target box}} &= s_y h_{\text{source box}}
\end{align}

where

\begin{align}
s_x &= \frac{w_{\text{target frame}}}{w_{\text{source frame}}} \\
s_y &= \frac{h_{\text{target frame}}}{h_{\text{source frame}}}
\end{align}
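
As a minimal sketch of the pure-scaling transform above (the `Box` type and `rescale_box` function are hypothetical names for illustration, not the vars-gridview code), assuming boxes are given as top-left `x`, `y` plus `w`, `h` in source-frame pixels:

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical bounding box: top-left corner plus width and height, in pixels."""
    x: float
    y: float
    w: float
    h: float


def rescale_box(box: Box, source_size: Tuple[int, int], target_size: Tuple[int, int]) -> Box:
    """Scale a box from source-frame pixels to target-frame pixels.

    Sizes are (width, height). Assumes the two frames differ only in scale.
    """
    source_w, source_h = source_size
    target_w, target_h = target_size
    s_x = target_w / source_w  # horizontal scale factor
    s_y = target_h / source_h  # vertical scale factor
    return Box(s_x * box.x, s_y * box.y, s_x * box.w, s_y * box.h)


# A box on a 1920x1080 source frame mapped onto a 960x540 target frame.
print(rescale_box(Box(100, 50, 200, 100), (1920, 1080), (960, 540)))
# Box(x=50.0, y=25.0, w=100.0, h=50.0)
```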

But, with a letterbox on the source frame, we would need:

\begin{align}
x_{\text{target box}} &= s_x (x_{\text{source box}} - w_{\text{source letterbox}}) \\
y_{\text{target box}} &= s_y (y_{\text{source box}} - h_{\text{source letterbox}}) \\
w_{\text{target box}} &= s_x w_{\text{source box}} \\
h_{\text{target box}} &= s_y h_{\text{source box}}
\end{align}

where

\begin{align}
s_x &= \frac{w_{\text{target frame}}}{w_{\text{source frame}} - 2 w_{\text{source letterbox}}} \\
s_y &= \frac{h_{\text{target frame}}}{h_{\text{source frame}} - 2 h_{\text{source letterbox}}}
\end{align}
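
The letterbox-aware version could be sketched as follows. This is only an illustration of the equations above: the function name and argument layout are made up here, and it assumes the bars are symmetric (the same width on the left and right, the same height on the top and bottom), which is what the factor of 2 in the scale factors implies.

```python
from typing import Tuple


def rescale_letterboxed_box(
    box: Tuple[float, float, float, float],
    source_size: Tuple[int, int],
    target_size: Tuple[int, int],
    letterbox: Tuple[int, int] = (0, 0),
) -> Tuple[float, float, float, float]:
    """Map an (x, y, w, h) box from a letterboxed source frame to the target frame.

    `letterbox` is the (width, height) of a single bar on each side of the
    source frame; (0, 0) reduces this to plain scaling.
    """
    x, y, w, h = box
    source_w, source_h = source_size
    target_w, target_h = target_size
    lb_w, lb_h = letterbox

    # Size of the actual image content inside the letterboxed source frame.
    content_w = source_w - 2 * lb_w
    content_h = source_h - 2 * lb_h
    if content_w <= 0 or content_h <= 0:
        raise ValueError("letterbox is larger than the source frame")

    s_x = target_w / content_w
    s_y = target_h / content_h

    # Offset the position first, then scale; the box size is only scaled.
    return (s_x * (x - lb_w), s_y * (y - lb_h), s_x * w, s_y * h)


# 1920x1440 source frame with 180 px bars on top and bottom (1920x1080 content),
# mapped onto a 960x540 target frame.
print(rescale_letterboxed_box((100, 280, 200, 100), (1920, 1440), (960, 540), (0, 180)))
# (50.0, 50.0, 100.0, 50.0)
```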

kevinsbarnard commented 11 months ago

As discussed in the VARS meeting on 11/15/2023, we are ignoring the letterbox case.

kevinsbarnard commented 11 months ago

Fixed in v0.5.0; localizations are now rescaled to the source video dimensions in Sharktopoda 2. I've also added a warning that will appear when the source/target aspect ratios are different to help address the case above.
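
For illustration only, an aspect ratio check of this kind might look like the sketch below. This is not the v0.5.0 implementation; the function name, the tolerance, and the warning text are all assumptions.

```python
import math
import warnings
from typing import Tuple


def aspect_ratios_match(
    source_size: Tuple[int, int],
    target_size: Tuple[int, int],
    rel_tol: float = 1e-2,  # assumed tolerance; absorbs small rounding differences
) -> bool:
    """Return True if the (width, height) sizes share an aspect ratio; warn otherwise."""
    source_w, source_h = source_size
    target_w, target_h = target_size
    source_ratio = source_w / source_h
    target_ratio = target_w / target_h

    matches = math.isclose(source_ratio, target_ratio, rel_tol=rel_tol)
    if not matches:
        warnings.warn(
            f"Source aspect ratio ({source_ratio:.3f}) differs from target "
            f"aspect ratio ({target_ratio:.3f}); rescaled boxes may be misplaced."
        )
    return matches


aspect_ratios_match((1920, 1080), (960, 540))  # same 16:9 ratio, no warning
aspect_ratios_match((1920, 1440), (960, 540))  # 4:3 vs 16:9, emits a warning
```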

mbari-org / vars-gridview

Rescale localizations for Sharktopoda 2 #39