kornia / kornia

Geometric Computer Vision Library for Spatial AI
https://kornia.readthedocs.io
Apache License 2.0
10.03k stars 975 forks source link

Inconsistent conventions in function names #2781

Open ducha-aiki opened 10 months ago

ducha-aiki commented 10 months ago

Describe the bug

Here the points go second https://kornia.readthedocs.io/en/latest/geometry.linalg.html#kornia.geometry.linalg.transform_points

Here first https://kornia.readthedocs.io/en/latest/geometry.camera.perspective.html

Reproduction steps

https://twitter.com/Parskatt/status/1754501114777456916

Expected behavior

Not be confusing.

Environment

N/a

Additional context

No response

edgarriba commented 10 months ago

what is the confusion here ? even though you can use one to solve the other, in principle the functions in principle are unrelated.

The first, applies a generic transformation to coordinates system of a set points given the mapping transformation

points_in_robot = transform_points(robot_from_world, points_in_world) 

the second has a specific purpose to project a set of 3d points to the z1 image plane given the calibration parameters of an affine model. The last is an old function that's is planned to be replaced by the new cameras api (based on sophus conventions): https://kornia.readthedocs.io/en/latest/geometry.camera.html

ducha-aiki commented 10 months ago

Well, it would be really simple and great, if we use same idea - points first, or points last, across all the library. Otherwise, one has to consistently look at the documentation. Old or not, it doesn't matter.

Parskatt commented 9 months ago

https://twitter.com/Parskatt/status/1757423036771254499

Parskatt commented 9 months ago

This inconsistency makes it confusing to use. To me, projection is P: 3x4, intrinsics are K: 3x3, and pose is R: 3x3 ,t: 3x1 alt. T: 4x4

unproject_points should take K + image_coords + depth -> 3D points transform_points should take 3D points + either (R, t) or T -> 3D points projection_from_KRt takes K + R + t -> P project_points should take 3D points + P -> 2D points

For me this is by far the most common usecase of these functions. Am I supposed to be creating PinholeCamera objects? To me the functional form is much easier to read.

I mean I can just write these functions myself instead, but I'd like to be able to use standardized stuff.

edgarriba commented 9 months ago

@Parskatt thanks for the feedback.

The unproject_points / project_points were originally inspired by the tf.graphics.camera api and possibly the new way to use hem will be following the new Kornia apis here for Projections / Distortion.

The new apis are coming from Sophus / sophus-rs and a bit of Pangolin that have years of iterations for production in known companies and we use internally in our group for camera calibration and slam stuff. Possibly kornia-rs will adopt some of this too since the idea is to converge more and more to the Sophus api.

projection_from_KRt is part of the epipolar geometry package that's is actually a port from OpenCV SfM module which i contributed many years ago and actually coming from Libmv (backend mv library for Blender).

transform_points was kind the first kornia (torchgeometry at Arraiy times) function meant to be generic points coordinate system operator.

That's the context with all the why's for this apis -- said that, i'm totally happy that we define together a standard apif for the calibration / geometry api that makes sense for kornia context and whom we useit everyday. If you like we can open a design docs and discuss there the apis. Keep in mind backward compatibilities and such based on our new releases schedule.

/cc @cjpurackal since he's involved too with the new sensors and lie api

ducha-aiki commented 9 months ago

@edgarriba exactly, we have inconsistent inspirations, and we should unify them. Users don't care about the history

Parskatt commented 9 months ago

While I'm at it, why is this: Screenshot_20240214_115137_Chrome.jpg

Composed on the left? I expected it would be on the right.

edgarriba commented 9 months ago

Composed on the left? I expected it would be on the right.

that was an arbitrary choice i guess when that function was created. Depending on the application you would like in one or the other side, I'd be in favor in deprecating that and just teaching our users to use as follows:

these all should be equivalent (rigid body transform and points conventions are explained here)

t1_from_t2 = compose_transform(t1_from_t0, t0_from_t2)
t1_from_t2 = compose_transform(t0_from_t1.inverse(), t0_from_t2)
t1_from_t2 = compose_transform(t1_from_t0, t2_from_t0.inverse())
t1_from_t2 = compose_transform(t0_from_t1.inverse(), t2_from_t0.inverse())
t1_from_t2 = compose_transform(t2_from_t0, t1_from_t0.inverse()).inverse()

where compose_transform is just a convenient/tested function that handles matmul and batching if existing. That's my plan at least for kornia-rs which eventually will be the backend for small rank matrices. Pytorch is extremely inefficient at that link1 link2