facebookresearch / OrienterNet

Source Code for Paper "OrienterNet: Visual Localization in 2D Public Maps with Neural Matching"

Potential label leakage issue due to tile stitching in SD map #48

Open sunnyykk opened 4 months ago

sunnyykk commented 4 months ago

Hello! I'm truly thankful for the insights presented in your paper.

While studying this outstanding work, I noticed that you implemented a tiling process in lines 108 to 125. However, when reassembling the tiled rasters back into a single image, there may be discrepancies at the seams compared to the original image. This could be due to the fact that, when a straight line is divided into segments, the end of the line might be prematurely rounded to the next pixel, resulting in a 1-pixel difference in the reassembled image.
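One way such a 1-pixel difference can arise is sketched below in plain Python. It compares a textbook Bresenham rasterization of a whole line with the same line cut at a seam, where the cut point is rounded to the pixel grid before each half is drawn. The functions `bresenham`, `line_y_at`, and `draw_split` are hypothetical helpers written for this illustration. They are not OrienterNet or OpenCV code, and the sketch only illustrates the hypothesized mechanism.

```python
def bresenham(x0, y0, x1, y1):
    """Integer Bresenham line from (x0, y0) to (x1, y1), endpoints included."""
    pixels = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return pixels


def line_y_at(p0, p1, x):
    """Exact (floating-point) y of the line p0 -> p1 at column x."""
    (x0, y0), (x1, y1) = p0, p1
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def draw_split(p0, p1, seam_x):
    """Draw the line as two halves whose cut points are rounded to pixels."""
    left_end = (seam_x - 1, round(line_y_at(p0, p1, seam_x - 1)))
    right_start = (seam_x, round(line_y_at(p0, p1, seam_x)))
    return bresenham(*p0, *left_end) + bresenham(*right_start, *p1)


full = set(bresenham(0, 0, 10, 3))
stitched = set(draw_split((0, 0), (10, 3), seam_x=5))
print(sorted(full ^ stitched))  # [(8, 2), (8, 3)]
```

In this toy case the two renderings disagree in one column near the seam: rounding the cut point slightly changes the slope of the second half, so one pixel lands a row lower.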

The example image below illustrates the difference between the original 256x256 SD map and the reassembled image from four 128x128 sub-images that were initially split and then stitched back together.

[Figures: original 256x256 SD map; map reassembled from four 128x128 tiles]

Of course, such discrepancies are usually negligible. However, there is an exception in the following scenario. When I obtain the WGS84 ground truth for a 2D query image, I use it as the center of the extracted SD map, set the dimensions to 256x256, and keep tile_size at the default value of 128.

As a result, the tile_manager splits the map into four tiles exactly along the ground-truth coordinates. Later, when we randomly select a 128x128 bounding box on this 256x256 SD map and call this function to obtain the canvas.raster for training, the model, interestingly, learns that the seams in these maps reveal the true position of the GT. Consequently, our model suffers from significant label leakage🤣!

Below is the visualization. Observe the cross lines at the GT location on the neural map.

[Figure: neural map with cross lines at the GT location]

Therefore, my conclusion is: The process of segmenting and then reassembling the SD map leaves scars on the map that are difficult to heal, and although they are minor, they still exhibit certain features that can be learned. If these scars happen to coincide with the ground truth or original GPS coordinates when creating the dataset, it might enable the model to directly identify the leaked labels on the raster or interfere with the sensitivity to the GPS priors.
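A dataset could be screened for this situation with a check along the following lines. This is a minimal sketch in plain Python: the helper names, the pixel-coordinate convention, and the `margin` parameter are assumptions made for illustration and are not part of maploc.

```python
def distance_to_nearest_seam(coord, origin, tile_size):
    """Distance in pixels from `coord` to the closest tile boundary on one axis."""
    offset = (coord - origin) % tile_size
    return min(offset, tile_size - offset)


def gt_on_seam(gt_uv, origin_uv=(0, 0), tile_size=128, margin=1):
    """True if the ground-truth pixel lies within `margin` pixels of a seam."""
    return any(
        distance_to_nearest_seam(c, o, tile_size) <= margin
        for c, o in zip(gt_uv, origin_uv)
    )


# Scenario from this issue: a 256x256 map centered on the GT, tile_size=128.
print(gt_on_seam((128, 128), tile_size=128))  # True
# A GT that falls well inside a tile is not flagged.
print(gt_on_seam((70, 45), tile_size=128))  # False
```

Note that this simple modulo test also treats the outer map border as a seam, which may or may not be desired.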

sarlinpe commented 4 months ago

Wow, very clear and detailed investigation, thank you! This is indeed a serious and tricky problem. I think that I never encountered it in any of my experiments because I always performed the tiling on a much larger area defined by multiple images, such that the seams of the tiles never coincide with GT locations. The tile manager was not really designed to be defined per image.

  1. In your case, an easy fix I can think of is to set the tile size to 256, such that no tiling occurs. You could then still query 128x128 tiles at training time.
  2. To remedy the problem at the source: is the 1-pixel difference due to a bug in cv2.polylines? (called here) We don't explicitly divide lines into segments, we just let OpenCV draw outside the canvas - its boundary calculation is maybe buggy?
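The effect of the first suggestion can be sketched in plain Python. The helpers `interior_seams` and `random_crop_origin` are hypothetical names used only for this illustration; they are not part of maploc and assume a square map measured in pixels.

```python
import random


def interior_seams(map_size, tile_size):
    """Pixel offsets of tile seams that fall strictly inside the map."""
    return list(range(tile_size, map_size, tile_size))


def random_crop_origin(map_size, crop_size, rng):
    """Top-left corner (u, v) of a random crop that fits inside the map."""
    limit = map_size - crop_size
    return rng.randint(0, limit), rng.randint(0, limit)


print(interior_seams(256, 128))  # [128]
print(interior_seams(256, 256))  # []

# With a single 256x256 tile, 128x128 training crops can still be sampled.
rng = random.Random(0)
u, v = random_crop_origin(256, 128, rng)
```

With a tile size of 256 the 256x256 map has no interior seam, so a random 128x128 crop cannot contain a stitching artifact at the GT location.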
sunnyykk commented 4 months ago

Thank you for confirming and for the suggestions. After further investigation:

The pixel discrepancies at the seams are indeed not caused by the UV coordinates, but rather by how the cv2.polylines function draws the line. When I replaced cv2 with PIL for line drawing, the line became continuous across the seam.

Below is a minimal code example to replicate the issue. I rendered a straight line across the full canvas and compared it with the same line drawn on separate tiles that were then stitched together, shown in green and red, respectively.

[Figures: line drawn on the full canvas (green) versus on the stitched tiles (red)]

import cv2
import numpy as np
from PIL import Image, ImageDraw
from maploc.osm.raster import Canvas
from maploc.utils.geo import BoundaryBox

# line_xy = np.array([[-80, 25],
#                     [90, -20]])
# line_xy = np.array([[-85, 25],
#                     [90, -20]])
# line_xy = np.array([[-80, 27],
#                     [90, -20]])
line_xy = np.array([[-88.123, 25.4312],
                    [90.324, -23.789]])

class Canvas(Canvas):
    def draw_line(self, xy: np.ndarray, width: float = 1):
        uv = self.to_uv(xy)

        # Option 1: OpenCV drawing (shows the discrepancy at the seam).
        # cv2.polylines(self.raster, uv[None].round().astype(np.int32), False, 255, thickness=width)

        # Option 2: Pillow drawing (continuous across the seam).
        image = Image.fromarray(self.raster)
        draw = ImageDraw.Draw(image)
        (x1, y1), (x2, y2) = uv.round().astype(np.int32).tolist()
        draw.line((x1, y1, x2, y2), fill=255, width=width)
        self.raster = np.array(image)

ppm = 1
bbox_tile_all = BoundaryBox(np.array([-100, -100]), np.array([100, 100]))

canvas_all = Canvas(bbox_tile_all, ppm)
canvas_all.draw_line(line_xy)
ori_raster = canvas_all.raster

bbox_tiles = []
bbox_tiles.append(BoundaryBox(np.array([-100, -100]), np.array([0, 100])))
bbox_tiles.append(BoundaryBox(np.array([0, -100]), np.array([100, 100])))

res = []
for bbox_tile in bbox_tiles:
    canvas = Canvas(bbox_tile, ppm)
    canvas.draw_line(line_xy)
    res.append(canvas.raster)

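# Stitch the two tiles side by side, then shift the result down by 3 pixels
# so that both lines stay visible in the overlay.
concat_raster = np.concatenate(res, axis=1)
concat_raster = np.roll(concat_raster, 3, axis=0)
# OpenCV writes BGR: green = full canvas, red = stitched tiles.
comp_stack = np.stack([np.zeros_like(ori_raster), ori_raster, concat_raster], axis=-1)
cv2.imwrite('line.jpg', comp_stack)

sarlinpe commented 4 months ago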

Thank you for the perfect minimal reproduction! Another OpenCV bug then... We could switch all drawing functions to PIL.ImageDraw. Since it doesn't operate on a numpy array, we would create a new drawing object and make Canvas hold only the final raster (maybe renaming the class to Tile). You're welcome to contribute if you want.

cc @AlanSavio25

sunnyykk commented 3 months ago

Thank you for your friendly response! I am very interested in making contributions if possible!