kornia / kornia

Geometric Computer Vision Library for Spatial AI
https://kornia.readthedocs.io
Apache License 2.0

Changing dtype of keypoints could cause them to shift #2947

Open lahavlipson opened 2 months ago

lahavlipson commented 2 months ago

Describe the bug

This isn't exactly a "bug" but...

In the Keypoints class, the merge_with_descriptors method casts the keypoints to the dtype of the descriptors. If the descriptors were extracted in a low-precision dtype (e.g., float16 or bfloat16), this cast can shift keypoints extracted from megapixel images by up to a full pixel, since those dtypes have too few mantissa bits to represent large pixel coordinates exactly.

https://github.com/kornia/kornia/blob/bdeac07e1edc26863a9c8d0826dc202974fd850a/kornia/feature/disk/structs.py#L82
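
To give a rough sense of the magnitude, here is a minimal standalone sketch (standard library only, not kornia code) that rounds hypothetical integer pixel coordinates to float16 and bfloat16 and reports the largest resulting shift. The helper names and the choice of integer coordinates are assumptions made for illustration; real keypoints may be sub-pixel, and bfloat16 is emulated here by rounding the float32 bit pattern to its top 16 bits.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_bfloat16(x: float) -> float:
    """Emulate bfloat16: keep the top 16 bits of the float32 pattern,
    rounding to nearest with ties to even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias (ties to even)
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def max_shift(rounder, axis_length: int) -> float:
    """Largest |rounded - original| over integer coordinates 0..axis_length-1.

    Assumption: keypoints sit on integer pixel positions along one axis.
    """
    return max(abs(rounder(float(c)) - float(c)) for c in range(axis_length))


if __name__ == "__main__":
    for axis_length in (1024, 4096):
        print(
            axis_length,
            "float16:", max_shift(round_to_float16, axis_length),
            "bfloat16:", max_shift(round_to_bfloat16, axis_length),
        )
```

Under these assumptions, float16 leaves integer coordinates below 2048 untouched and shifts them by up to 1 pixel on a 4096-pixel axis, while the emulated bfloat16 already shifts by up to 2 pixels on a 1024-pixel axis and up to 8 pixels on a 4096-pixel axis.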

Reproduction steps

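import torch
import kornia.feature as KF

disk = KF.DISK.from_pretrained("depth").to("cuda")
# megapixel_image: a (B, 3, H, W) image tensor already on the GPU
with torch.cuda.amp.autocast():
    features = disk(megapixel_image, 1024)
print(features[0].keypoints.dtype, features[0].descriptors.dtype)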

Expected behavior

It might make sense to cast the keypoints to float32, or to just leave their dtype unchanged.

Environment

N/A

Additional context

"Bug" found in commit bdeac0

edgarriba commented 1 month ago

@ducha-aiki