SMPTE / ris-osvp-metadata-camdkit

Implements the SMPTE RIS OSVP camera metadata model
BSD 3-Clause "New" or "Revised" License

Allow the entrance pupil position to be positive or negative #95

Closed palemieux closed 1 year ago

palemieux commented 1 year ago

Add Rational parameter type.

Closes #91

repentsinner commented 1 year ago

What coordinate system handedness do we want to use for this, and what does OpenCV use for optical axis +ve/-ve? Presumably many downstream consumers will be OpenCV based at least in the short term.

palemieux commented 1 year ago

> What coordinate system handedness do we want to use for this, and what does OpenCV use for optical axis +ve/-ve? Presumably many downstream consumers will be OpenCV based at least in the short term.

Yes, the exact question to answer.

repentsinner commented 1 year ago

A data point from USD:

Cameras in USD are always "Y up", regardless of the stage's orientation (i.e. UsdGeomGetStageUpAxis()). This means that the inverse of 'camXform' (the VIEW half of the MODELVIEW transform in OpenGL parlance) will transform the world such that the camera is at the origin, looking down the -Z axis, with +Y as the up axis, and +X pointing to the right. This describes a right handed coordinate system.

from OpenCV:

image

Par for this course, they appear to me to share the optical axis (z) and have opposing signs 🙃
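To make the sign difference concrete, here is a minimal sketch of converting a camera-space point between the two conventions. It assumes the usual OpenCV camera frame (+X right, +Y down, looking down +Z) and the USD frame quoted above (+X right, +Y up, looking down -Z). The function name is hypothetical and is not part of USD, OpenCV, or camdkit.

```python
# Hypothetical helper for illustration only; not an API of USD, OpenCV, or camdkit.
def usd_camera_to_opencv(point):
    """Convert a camera-space point from USD axes to OpenCV axes.

    USD camera space:    +X right, +Y up,   camera looks down -Z.
    OpenCV camera space: +X right, +Y down, camera looks down +Z (assumed).
    Both frames are right-handed; they differ by a 180 degree rotation
    about the X axis, which negates Y and Z.
    """
    x, y, z = point
    return (x, -y, -z)


# A point 2 units in front of a USD camera and 1 unit above the optical axis.
print(usd_camera_to_opencv((0.0, 1.0, -2.0)))  # (0.0, -1.0, 2.0)
```

The same function converts back from OpenCV to USD, since applying the rotation twice is the identity.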

JGoldstone commented 1 year ago

My original intent, because I'm so used to OpenEXR, was that we would mimic its coordinate system; but what I had been thinking of as the appropriate coordinate system was the screen window coordinate system, left-handed with x-axis to the right and y up. On closer examination because of this PR, and looking at the 2023 version of ST 2065-4, I see that the OpenEXR camera coordinate system is different from the OpenEXR screen window coordinate system; cameraPosition is in a coordinate system that is right-handed with z up and y pointing into the scene.

But on reflection, the camera coordinate system in a particular transport mechanism's definition (OpenEXR's) is less appropriate than a coordinate system established for more general uses. I would argue that USD has the most momentum right now for transporting geometry across application boundaries, and that's where we should go when we have to use a 3D Cartesian coordinate system.

That said, this is one reason why I have advocated entrancePupilOffset rather than entrancePupilPosition. No matter what the axis along which the camera is looking happens to be called, a positive number for entrance pupil offset means the entrance pupil is in front of the imaging plane along the line segment passing through (speaking roughly) the sensor center to the object in focus, and a negative number means the entrance pupil is behind the sensor along that same line. Is that coordinate-system-independence worth changing our notion of entrance pupil position to entrance pupil offset?
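As a rough illustration of that coordinate-system independence, the sketch below turns one signed offset into a 3D point by walking along whatever the consumer calls "forward". The function name, the unit (metres), and the two forward vectors (USD looking down -Z, OpenCV assumed to look down +Z) are assumptions made for this example, not anything defined in camdkit.

```python
# Hypothetical illustration; names and units are assumptions, not camdkit API.
def entrance_pupil_point(sensor_center, forward, offset_m):
    """Place the entrance pupil offset_m metres along the view direction.

    A positive offset lands in front of the imaging plane, a negative one
    behind it, regardless of which axis the consumer calls "forward".
    """
    return tuple(c + offset_m * f for c, f in zip(sensor_center, forward))


USD_FORWARD = (0.0, 0.0, -1.0)     # USD cameras look down -Z
OPENCV_FORWARD = (0.0, 0.0, 1.0)   # assumed OpenCV convention: look down +Z

origin = (0.0, 0.0, 0.0)
print(entrance_pupil_point(origin, USD_FORWARD, 0.1))     # (0.0, 0.0, -0.1)
print(entrance_pupil_point(origin, OPENCV_FORWARD, 0.1))  # (0.0, 0.0, 0.1)
```

The stored value (0.1) is the same in both cases; only the consumer's choice of forward axis changes the resulting coordinates.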

JGoldstone commented 1 year ago

I would like to see the signed rational type be more congruent to OpenEXR and, I believe, C++, with a denominator that was unsigned and ranged within [0 ... 4,294,967,295].

palemieux commented 1 year ago

> [0 ... 4,294,967,295].

0 needs to be excluded.

palemieux commented 1 year ago

@JGoldstone Does this work better?

https://github.com/SMPTE/ris-osvp-metadata-camdkit/pull/95/commits/30902d138360f118ffb24d906d93ef1eed2d9cbc

JGoldstone commented 1 year ago

Agreed.

That OpenEXR lets one use a zero in the denominator and MIN_INT or MAX_INT in the numerator as an alternate way of representing negative and positive infinity, respectively, is not useful when we've all gotten used to using std::numeric_limits<> to get this; same for NaN. What you propose is closer to C++ and I very much approve.
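For readers following along, here is a minimal sketch of a signed rational with the constraints discussed above: an unsigned 32-bit denominator with zero excluded. The class name, the signed 32-bit numerator range, and the validation style are assumptions for illustration; this is not the code in the linked commit.

```python
from dataclasses import dataclass
from fractions import Fraction

INT32_MIN = -2**31       # assumed numerator lower bound (signed 32-bit)
INT32_MAX = 2**31 - 1    # assumed numerator upper bound (signed 32-bit)
UINT32_MAX = 2**32 - 1   # 4,294,967,295, the denominator upper bound


@dataclass(frozen=True)
class SignedRational:
    """Hypothetical signed rational: the sign lives in the numerator only."""
    numerator: int
    denominator: int

    def __post_init__(self):
        if not INT32_MIN <= self.numerator <= INT32_MAX:
            raise ValueError("numerator out of signed 32-bit range")
        # Zero is excluded, so infinities and NaN cannot be encoded.
        if not 0 < self.denominator <= UINT32_MAX:
            raise ValueError("denominator must be in (0 ... 4,294,967,295]")

    def as_fraction(self) -> Fraction:
        return Fraction(self.numerator, self.denominator)


print(SignedRational(-1, 8).as_fraction())  # -1/8
```

With this shape, `SignedRational(1, 0)` raises `ValueError` instead of standing in for positive infinity.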

repentsinner commented 1 year ago

Note to check against Cooke /i, Arri, Zeiss formats.

JGoldstone commented 1 year ago

The ARRI definition is this. The -1 value was an accidental carryover from another field, and should not be there.

image

We use the term 'reference plane' in that definition. Its definition is here:

image

In the Cooke "Camera and Lens Definitions for VFX" paper "image plane" is used, not "reference plane".

image

Note that Cooke uses 'position' to indicate, not a place in space, but a distance along the optical axis.

repentsinner commented 1 year ago

@JGoldstone does the note to entry of reference plane imply that the reference plane is always at the front of the thin optical elements when present? Or should we be saying something like "plane perpendicular to the optical axis corresponding to where the image is (optimally?) formed" and let the sensor (and associated, likely proprietary, optical element stack of OLPF, hot mirrors, micro lenses, etc) land where is optimal relative to that plane?

jamesmosys commented 1 year ago

This is Mo-Sys' definition for reference - but happy to change it! (I agree from the discussion that CCD is not the correct term - this is historic for us.)

Zepd is the entrance pupil distance (offset from CCD to the point of no parallax). Note how +Zepd brings objects closer to the camera.

The sign was chosen to match the tracker's coordinate system.

JGoldstone commented 1 year ago

@repentsinner thanks for pointing that out. I suppose there is some variation but at least on all the cameras we make, the imaging plane is between the front surface of the sensor and the foremost surface of the entire 'sandwich'. Below (from Wikipedia's article on OLPFs) is an image showing how an image might be formed; ignoring the lack of microlens array or color filter array material, one can say the imaging plane is between the sensor surface and the front surface of the OLPF.

image

I have a couple of questions. First, does the Plimsoll mark correspond to the imaging plane? Second, is the distance between the point where the optical axis intersects the plane that contains the unshimmed face of the camera lens mount and where the optical axis intersects the imaging plane equal to the flange focal depth? I will ask my friend in our Optics group.

repentsinner commented 1 year ago

Thanks for elaborating @JGoldstone. I think my confusion stems from the "between" part of things - somewhere in that optical sandwich (which has some tangible depth) is the imaging plane. For our purposes, how do we determine where in that depth it is? I know Dave Stump mentioned a device to set the flange face relative to what I assumed to be the front face of the stack, but if it then has to be offset some amount from the front of the stack (a measurable location) to some point inside the stack, that seems more challenging.

All of the Plimsoll marks I've seen appear to be engraved/painted at a resolution that seems way too low relative to the stack depth, and seem more likely useful for measuring to the focus plane (? the plane out in the world, not the imaging plane), not for setting the flange-to-image-plane distance. Not sure if that is what you were getting at. Presumably shim distances are in the ~0.01mm range which is in the noise when measuring from Plimsoll (imaging plane) to subject, but meaningful when measuring imaging plane to flange.

Per your second question, my understanding is that the flange focal distance would include or be corrected by the shims in systems that support shimming (or other adjustment mechanisms). On PL, we want 52.00mm, and shim accordingly to achieve that? I believe this is why we wanted to include the flange focal distance as part of this system - the flange is the one place that both the lens and body agree on.

JGoldstone commented 1 year ago

@repentsinner I didn't mean to imply the Plimsoll mark was something against which one would make measurements; what I said was more in the sense of "well, if the Plimsoll mark doesn't correspond to the front of the sensor, then to what does it correspond?"

My optics guy may have once said something like "the imaging plane is the image of the sensor", in which case, one could focus on it. I think. But I will ask.

And I would like to better understand any effect behind-the-lens filters (like ARRI Impressions filters) and lens extenders might have on flange focal distance, i.e. whether one ends up shimming in some or all cases. I did find an internal document on this (yay) but the optical terminology it is using, in German, is way beyond me (boo).

I know I've quoted him at least once before in this group, but I still love this, from Thor Olson: "Education is the process of telling smaller and smaller lies".

palemieux commented 1 year ago

Below is the proposed convention based on a discussion with @JGoldstone:

image

repentsinner commented 1 year ago

I like the 'nominal imaging plane' nomenclature as it nicely sidesteps us having to define anything about an implementation-dependent sensor optical stack.

I might recommend 'entrance pupil distance' instead of 'entrance pupil position' for this particular definition as most colloquially use distance as a 1D term and position as a 2D/3D term, but that's just me being pedantic.

Presumably in any 1D case we could get into trouble if we start to consider that the entrance pupil may not actually be on the optical axis due to design, manufacturing, or use/assembly tolerances (e.g., I am unclear if the entrance pupil moves off the optical axis if the center shifts due to zoom in a loose tolerance design or the whole axis shifts?).

JGoldstone commented 1 year ago

Could we please have 'entrance pupil offset' to match the entrancePupilOffset attribute recently defined in OpenEXR 3.2 (which will be part of the VFX Reference Platform for CY2024)?

https://github.com/AcademySoftwareFoundation/openexr/blob/159e57120dae5e587454e0d4bd916c792831a8ce/src/lib/OpenEXR/ImfStandardAttributes.h#L464

palemieux commented 1 year ago

> Could we please have 'entrance pupil offset' to match the entrancePupilOffset attribute recently defined in OpenEXR 3.2 (which will be part of the VFX Reference Platform for CY2024)?

Let's fix it here first: https://github.com/SMPTE/ris-osvp-metadata/issues/13

JGoldstone commented 1 year ago

@palemieux agreed that it should be fixed in the definitions before being fixed in camdkit. I have just added a comment in the issue you linked there, addressing the @bgschunck suggestion while, I hope, keeping the text concrete enough so that OSVP users will readily find it useful.