Parskatt / RoMa

[CVPR 2024] RoMa: Robust Dense Feature Matching; RoMa is the robust dense feature matcher capable of estimating pixel-dense warps and reliable certainties for almost any image pair.
https://parskatt.github.io/RoMa/
MIT License

Finding matches with arbitrary query points #24

Open justachetan opened 3 months ago

justachetan commented 3 months ago

Hi @Parskatt ,

Any ETA on when the demo for matching arbitrary keypoints will be released? The README says that it is possible and the demo will be released soon.

Is there any function in the current codebase that can be directly used to match arbitrary query points? If yes, I would be thankful for a pointer to the same.

Thanks!

Yours sincerely, Aditya

Parskatt commented 3 months ago

No demo yet, but see here:

https://github.com/Parskatt/RoMa/blob/50522299a55efc14ed892caaca9e29a1c8b73e12/roma/models/matcher.py#L545

Note: this assumes keypoint coordinates in the normalized [-1,1] grid, and we use COLMAP conventions for pixel coordinates (i.e., you might want to add 0.5 if using SuperPoint, for example).
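To make the coordinate note concrete, here is a minimal sketch of the conversion between pixel coordinates and the normalized [-1,1] grid. It assumes the image spans [0, width] x [0, height], so that the centre of the top-left pixel sits at (0.5, 0.5). The helper names `pixel_to_normalized` and `normalized_to_pixel` and the `center_offset` parameter are hypothetical and not part of RoMa.

```python
def pixel_to_normalized(x, y, width, height, center_offset=0.0):
    """Map pixel coordinates to the normalized [-1, 1] grid.

    Assumption: the image spans [0, width] x [0, height], so the centre of
    the top-left pixel is (0.5, 0.5). Pass center_offset=0.5 for detectors
    that report the centre of the top-left pixel as (0, 0).
    """
    x_n = 2.0 * (x + center_offset) / width - 1.0
    y_n = 2.0 * (y + center_offset) / height - 1.0
    return x_n, y_n


def normalized_to_pixel(x_n, y_n, width, height):
    """Inverse mapping, back to pixel coordinates with centres at +0.5."""
    return (x_n + 1.0) * width / 2.0, (y_n + 1.0) * height / 2.0
```

Under these assumptions the image centre maps to (0, 0), and a keypoint reported at integer index (0, 0) should be shifted by 0.5 before normalizing.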

justachetan commented 3 months ago

Thanks, could you please share the expected shapes of x_A and x_B? Are they expected to be (number of samples) x 2?

Parskatt commented 3 months ago

Yes, and it seems I didn't implement it for batched versions, judging by the [None,None] in the grid_sample.

justachetan commented 3 months ago

Thanks for the quick response! Also, could you please answer the following questions:

  1. Are x_A and x_B lists of keypoints that are not currently in a 1-1 correspondence, i.e., x_A[i] can be the correct match for x_B[j] for any 0 <= j < len(x_B)?
  2. When computing x_A_to_B in the above function, why do we only take the last two indices along the last axis of warp? In other words, why are we using warp[...,-2:] instead of the complete warp? As per the formulation in the paper (Eq. 9), should we not be using the entire warp to obtain x_A_to_B?

Thanks in advance!

Parskatt commented 3 months ago

  1. Correct, but it should work either way.
  2. Internally I've used a 4D vector to represent the warp (input coord, output coord); sometimes that's nice if you want to convert to flow etc. But -2: just means that we take the mapping, not the input.

justachetan commented 2 months ago

Hi @Parskatt ,

I had a follow-up question about what warp represents. I can see that it is a tensor of dimensions H x 2W x 4 (where [H,W] are the image dimensions). I understood from your previous comment that the last dimension represents the input coordinates (first two indices) and their mapping in the target image (last two indices), but could you please explain why the size of the second dimension is 2W? Should it not be W as per the input?

Also thanks for your replies so far!

Parskatt commented 2 months ago

The 2W comes from symmetric warp, :W is from A to B and W: is from B to A.

Dense matchers are typically asymmetrical, so getting the symmetric warp requires running the refinement twice. You can toggle this by setting model.symmetric = True/False.
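The layout described in this thread can be illustrated on a toy array. This is only a sketch using NumPy with random values standing in for a real warp. The variable names are illustrative, and the "flow" here is simply the displacement in normalized coordinates, which is one assumed way to read "convert to flow".

```python
import numpy as np

H, W = 4, 6

# Toy stand-in for a symmetric warp of shape (H, 2W, 4); values are random,
# not the output of any model.
rng = np.random.default_rng(0)
warp = rng.uniform(-1.0, 1.0, size=(H, 2 * W, 4))

# Split along the second axis: the first W columns are A -> B,
# the remaining W columns are B -> A.
warp_A_to_B = warp[:, :W]  # shape (H, W, 4)
warp_B_to_A = warp[:, W:]  # shape (H, W, 4)

# The last axis holds (input coord, output coord).
src = warp_A_to_B[..., :2]   # input coordinates in image A
dst = warp_A_to_B[..., -2:]  # mapped coordinates in image B

# Displacement in normalized units (assumed meaning of "flow" here).
flow = dst - src
```

With model.symmetric = False the thread implies only the A to B half exists, so the split step would not be needed.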

nnop commented 2 months ago

Should the input warp shape be (H, W, 4) for grid_sample instead of (H, 2*W, 4)? @Parskatt

Parskatt commented 2 months ago

The implementation for keypoints currently assumes an asymmetrical warp, but could be made symmetric. I don't think it's obvious what the best way to do symmetric keypoint matching with dense warps is, though.

ilay-chen commented 3 weeks ago

Hi, did someone manage to use match_keypoints? I didn't understand what exactly the input and output of this function are... I generally want to match a point (x, y) in image A to image B and get the corresponding point... Thank you!
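Based on the replies above, the core idea appears to be sampling the A to B half of the dense warp at the normalized query points. Below is a minimal NumPy sketch of that idea, not RoMa's implementation. The function name `sample_warp` is hypothetical, and the pixel-centre convention (matching `align_corners=False` in `grid_sample`) and the clamping at the border are assumptions.

```python
import numpy as np


def sample_warp(warp_A_to_B, x_A):
    """Bilinearly sample mapped coordinates at normalized query points.

    warp_A_to_B: (H, W, 4) array; last axis is (x_A, y_A, x_B, y_B).
    x_A: (N, 2) array of query points in [-1, 1], ordered (x, y).
    Returns an (N, 2) array of normalized coordinates in image B.
    """
    H, W = warp_A_to_B.shape[:2]
    target = warp_A_to_B[..., -2:]  # keep only the mapping, not the input

    # Continuous array indices, assuming pixel centres at half-integers.
    col = (x_A[:, 0] + 1.0) * W / 2.0 - 0.5
    row = (x_A[:, 1] + 1.0) * H / 2.0 - 0.5
    # Clamp to the valid range (an assumption; other padding is possible).
    col = np.clip(col, 0, W - 1)
    row = np.clip(row, 0, H - 1)

    c0 = np.floor(col).astype(int)
    r0 = np.floor(row).astype(int)
    c1 = np.minimum(c0 + 1, W - 1)
    r1 = np.minimum(r0 + 1, H - 1)
    wc = (col - c0)[:, None]
    wr = (row - r0)[:, None]

    top = (1 - wc) * target[r0, c0] + wc * target[r0, c1]
    bottom = (1 - wc) * target[r1, c0] + wc * target[r1, c1]
    return (1 - wr) * top + wr * bottom


# Usage on a synthetic identity warp, where every point maps to itself.
H, W = 8, 10
xs = (np.arange(W) + 0.5) * 2.0 / W - 1.0
ys = (np.arange(H) + 0.5) * 2.0 / H - 1.0
grid = np.stack(np.meshgrid(xs, ys), axis=-1)  # (H, W, 2), ordered (x, y)
identity_warp = np.concatenate([grid, grid], axis=-1)  # (H, W, 4)

queries = np.array([[0.25, -0.5], [-0.6, 0.3]])  # (N, 2), normalized
x_A_to_B = sample_warp(identity_warp, queries)
```

The result is in normalized coordinates of image B. To get pixels, scale by the size of image B, for example (x + 1) * W_B / 2 and (y + 1) * H_B / 2 under the same convention. Matching against a given set x_B would then need an extra nearest-neighbour step, which is not shown here.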