Closed praeclarum closed 2 years ago
This is rather unfortunate, and I'm sorry we got this wrong.
Just to make sure I understand correctly, a potential fix would be to change this:
to:
result.Column3 = new SCNVector4 (x, y, z, 1);
(and the equivalent change in the result of the SCNMatrix4
class).
Great bug report btw!
Yes, exactly. Such changes to the construction methods would make them column-vector transformers. (Translation in the last column is how I always remember column vectors).
Unfortunately, there are a lot of those construction methods. The Transform
methods also have to be changed (they’re doing a left multiply right now).
Matrix multiply should be fine because it’s universal. But I have seen some matrix multiplies that ignore the skew elements of the matrix and that would make it different (because it’s not doing the full 4x4 multiply).
So happy my bug report made sense to you. Sorry it was so late, I didn’t know SceneKit was being modified.
I've looked a bit more into this, and oh dear Apple, they haven't made things easy.
Step 1 is to try to answer the question: "does SCNMatrix4 really use a column-major layout?"
There's nowhere in the headers that say whether SCNMatrix4
is row-major or column-major. There's no API that returns/works with rows or columns, they all act on the entire matrix. The only hint I could find was the layout of the definition:
typedef struct SCNMatrix4 {
float m11, m12, m13, m14;
float m21, m22, m23, m24;
float m31, m32, m33, m34;
float m41, m42, m43, m44;
} SCNMatrix4;
This kind of implies row-major (it "looks that way"), but it's not explicit stated.
The simd_float4x4
struct is column major, according to how it's defined:
typedef struct { simd_float4 columns[4]; } simd_float4x4;
This looks pretty obvious.
The conversion from simd_float4x4
to SCNMatrix4
clearly says that they're identical (either both row-major or column-major), because it's just blitting memory:
NS_INLINE SCNMatrix4 SCNMatrix4FromMat4(simd_float4x4 m) {
return *(SCNMatrix4 *)&m;
}
The opposite conversion (from SCNMatrix4
to simd_float4x4
) makes it clear they're both column-major:
NS_INLINE simd_float4x4 SCNMatrix4ToMat4(SCNMatrix4 m) {
return (simd_float4x4){
.columns[0] = simd_make_float4(m.m11, m.m12, m.m13, m.m14),
.columns[1] = simd_make_float4(m.m21, m.m22, m.m23, m.m24),
.columns[2] = simd_make_float4(m.m31, m.m32, m.m33, m.m34),
.columns[3] = simd_make_float4(m.m41, m.m42, m.m43, m.m44)
};
}
While there's no obvious SCNMatrix4
API telling us whether SCNMatrix4
is row-major or column-major, it has an effect on matrix multiplication.
Say we have two matrices:
$$A=\begin{bmatrix}1 & 0 & 0 & 2\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
and
$$B=\begin{bmatrix}3 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
multiplying these two gives us:
$$AB=\begin{bmatrix}3 & 0 & 0 & 2\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
now if the input matrices was transposed (i.e. switched between row-major and column-major), then we'd get this result:
$$X=\begin{bmatrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\2 & 0 & 0 & 1\end{bmatrix}$$
$$Y=\begin{bmatrix}3 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
$$XY=\begin{bmatrix}3 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\6 & 0 & 0 & 1\end{bmatrix}$$
Now let's create a translation SCNMatrix4:
SCNMatrix translate = SCNMatrix4MakeTranslation (2, 0, 0);
and we get back the identity matrix, except that m41=2
:
SCNMatrix4(m11: 1.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 2.0, m42: 0.0, m43: 0.0, m44: 1.0)
and let's create a scale SCNMatrix4:
SCNMatrix scale = SCNMatrix4MakeScale (3, 1, 1);
and this is the identity matrix, except that m11=3
:
SCNMatrix4(m11: 3.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 0.0, m42: 0.0, m43: 0.0, m44: 1.0)
Note that the scale matrix is the same for both row-major and column-major layouts.
Finally let's multiply these two:
SCNMatrix transform = SCNMatrix4Mult (translate, scale);
and we get back:
SCNMatrix4(m11: 3.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 6.0, m42: 0.0, m43: 0.0, m44: 1.0)
If SCNMatrix4
uses row-major layout, that would be:
$$rowTransform=\begin{bmatrix}3 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\6 & 0 & 0 & 1\end{bmatrix}$$
And if SCNMatrix4
uses column-major layout, that would be:
$$colTransform=\begin{bmatrix}3 & 0 & 0 & 6\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
Now let's say SCNMatrix4
uses column-major layout, it would mean the translate
matrix would be:
$$colTranslate=\begin{bmatrix}1 & 0 & 0 & 2\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
That's the A
matrix from above, which multiplied with the scale
matrix gives the AB
matrix from above, which is neither the row-major (rowTransform
) nor the column-major (colTransform
) interpretation of the multiplication results.
So let's try the other way, assume SCNMatrix4
uses row-major layout, it would mean the translate
matrix would be:
$$rowTranslate=\begin{bmatrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\2 & 0 & 0 & 1\end{bmatrix}$$
That's the X
matrix from above, which multiplied with the scale
matrix gives the XY
matrix from above, which is exactly the transform
matrix when we assume SCNMatrix4
is using row-major layout (rowTransform
).
In the end I guess I've just proved that:
$$(A^t)(B^t) ≠ (AB)^t$$
So there seems to be a mismatch between what SCNMatrix4
and simd_float4x4
do.
Let's try converting the SCNMatrix4
matrices we have to simd_float4x4
and multiply those types instead.
let translate = SCNMatrix4MakeTranslation (2, 0, 0)
let scale = SCNMatrix4MakeScale (3, 1, 1)
let transform = SCNMatrix4Mult (translate, scale)
let translateSimd = float4x4.init (translate)
let scaleSimd = float4x4.init (scale)
let transformSimd = translateSimd * scaleSimd
print (transform)
print (transformSimd)
that results in:
SCNMatrix4(m11: 3.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 6.0, m42: 0.0, m43: 0.0, m44: 1.0)
and
simd_float4x4([[3.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [2.0, 0.0, 0.0, 1.0]])
There's no need to study these results much, in the SCNMatrix4
you can see that m41=6.0
, but the value 6.0
never shows up in the simd_float4x4
results.
Maybe there's something weird about how SCNMatrix4Mult
works
Maybe the SCNMatrix4Mult
parameters could be reversed somehow? Reading the documentation, it clearly states that the first argument is the left operand, and the second argument is the right operand, so no luck there.
It also says: "As a transformation, the result of multiplying a matrix A by a matrix B is the transformation represented by B followed by the transformation represented by A."
Which is how composing matrices work with column vectors:
$$B(Ax) = (BA)x$$
Let's test this with SCNNode
:
let test = SCNNode()
let translate = SCNMatrix4MakeTranslation (2, 0, 0)
let scale = SCNMatrix4MakeScale (3, 1, 1)
test.transform = SCNMatrix4Mult (translate, scale);
print(test.position)
SCNVector3(x: 6.0, y: 0.0, z: 0.0)
OK, this means the translation is applied first, then scaled (x
starts off at 0
, we add 2
, and then multiply by 3
). The opposite of what the documentation said...
Let's try reverting the arguments to SCNMatrix4
:
let test = SCNNode()
let translate = SCNMatrix4MakeTranslation (2, 0, 0)
let scale = SCNMatrix4MakeScale (3, 1, 1)
test.transform = SCNMatrix4Mult (scale, translate);
print(test.position)
SCNVector3(x: 2.0, y: 0.0, z: 0.0)
This is the reverse: scaled first, then translated - once again the opposite of what the documentation for SCNMatrix4Mult
says.
For the first case (first translate, then scale), this is the resulting SCNMatrix4
:
SCNMatrix4(m11: 3.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 6.0, m42: 0.0, m43: 0.0, m44: 1.0)
Assuming column vectors, the only way for this matrix to transform the $\begin{bmatrix}0 & 0 & 0\end{bmatrix}$ position to $\begin{bmatrix}6 & 0 & 0\end{bmatrix}$ would be if the matrix was using column-major layout:
$$colTransform=\begin{bmatrix}1 & 0 & 0 & 6\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{bmatrix}$$
So this is another point towards SCNMatrix4
being column-major.
Maybe the documentation for SCNMatrix4Mult
utterly wrong?
Let's try comparing multiplication again, but reversing the arguments to SCNMatrix4Mult
:
let translate = SCNMatrix4MakeTranslation(2, 0, 0)
let scale = SCNMatrix4MakeScale (3, 1, 1)
let translateSimd = float4x4.init (translate)
let scaleSimd = float4x4.init (scale)
let transform = SCNMatrix4Mult (translate, scale)
let transformSimd = scaleSimd * translateSimd
print (transform)
print (transformSimd)
SCNMatrix4(m11: 3.0, m12: 0.0, m13: 0.0, m14: 0.0, m21: 0.0, m22: 1.0, m23: 0.0, m24: 0.0, m31: 0.0, m32: 0.0, m33: 1.0, m34: 0.0, m41: 6.0, m42: 0.0, m43: 0.0, m44: 1.0)
and
simd_float4x4([[3.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [6.0, 0.0, 0.0, 1.0]])
Bingo! These matrices are identical (if they're both column-major)!
a) SCNMatrix4
is column-major.
b) SCNMatrix4Mult
reverses the argument, and does a $BA$ multiplication instead of $AB$.
c) The documentation for SCNMatrix4Mult
is very wrong.
Here's another argument for SCNMatrix4
being column-major:
GLKMatrix4
can easily be determined to be column-major by creating a new GLKMatrix4
with the GLKMatrix4MakeWithColumns
method and print the first four numbers in the struct:
GLKMatrix4 glkm = GLKMatrix4MakeWithColumns (
GLKVector4Make(1.0, 1.0, 1.0, 1.0),
GLKVector4Make (2.0, 2.0, 2.0, 2.0),
GLKVector4Make (3.0, 3.0, 3.0, 3.0),
GLKVector4Make (4.0, 4.0, 4.0, 4.0));
NSLog (@"%f %f %f %f", glkm.m00, glkm.m01, glkm.m02, glkm.m03);
1.000000 1.000000 1.000000 1.000000
Then we create a SCNMatrix4
with the first four numbers set to the first column in the GLKMatrix4
, convert that to a GLKMatrix4
, and then we print the result:
SCNMatrix4 scn = SCNMatrix4Identity;
scn.m11 = 1.0;
scn.m12 = 1.0;
scn.m13 = 1.0;
scn.m14 = 1.0;
GLKMatrix4 glkm2 = SCNMatrix4ToGLKMatrix4(scn);
NSLog (@"%f %f %f %f", glkm2.m00, glkm2.m01, glkm2.m02, glkm2.m03);
1.000000 1.000000 1.000000 1.000000
Thank you for the analysis! I did mean to do some C/Swift examples to prove out the row/column ordering. Yeah, I’m pretty sure they’re column-major also, but I didn’t have facts.
b) SCNMatrix4Mult reverses the argument, and does a BA multiplication instead of AB.
Well that’s terrifying!!! Oh dear…
As you discovered, experiments do seem to be the only way to prove this stuff out. I think the SCNNode tests are the most important. Once you get Node working, it’s pretty easy to get the matrix self-consistent.
I wrote the next section without fully noting that you said you’re on Step 1 of solving this bug. I’ll leave it here but I understand you were just trying to answer 1 of the questions.
I was more concerned with he construction methods (rotation, scale, translation) and less about the ordering since the ordering can be verified pretty easily:
That procedure will pretty clearly tell you the memory order of the translation elements.
From there, we can decide if a left side vector multiplier or a right side multiplier:
The nice part of these tests is it allows you to answer these questions independent of our binding - Apple is dealing with the matrix 100% themselves. But their work can easily be read out of the class using the Position, Scale, Orientation properties.
Thanks for taking the deep dive. Can’t wait for this to get fixed!
@praeclarum I have a PR up that should fix this: #15160, could you review to see if it makes sense what I did?
I hope I got it right this time!
Short Version
TLDR; The recent change (https://github.com/xamarin/xamarin-macios/pull/13695) in .NET 6 to make
SCNMatrix4
column-major breaks theTransform
property ofSCNNode
. The change correctly identified thatSCNMatrix4
was bound incorrectly, but the solution to the problem is currently incomplete and makes the class behave in inconsistent ways.Long Version
Here is a video of me walking through discovering and explaining the problem: https://www.twitch.tv/videos/1491014116?t=00h15m47s
Linear Algebra Primer
In linear algebra, the order or multiplication is important.
If one wishes to use a matrix for transforming points, one must choose if points (vectors) will be multiplied on the left hand side of the matrix or on the right. If the vector is on the left, it must, by dimensional analysis, be a row-vector (shaped 1x4). If it is on the right, it must be a column vector (4x1).
This has repercussions in how the matrices are composed. If vectors are multiplied on the left (row vectors), then matrices should be composed so the left most operation happens first. If however, vectors are multiplied on the right (column vectors), then matrices should be composed so the first operation is on the right and the last is on the left.
The Column-major vs Row-major Debate
Regardless of what's in a matrix and how it was composed and intended to be used, it can be stored in either row or column major format. This decision is orthogonal to whether left or right vector multiplication will be used for actual transforms.
A History Lesson
There is a strange history in .NET to favor row-vectors over column vectors. I say strange because all of science, engineering, and pretty much every 3D engine in the universe uses column vectors.
SceneKit also uses column vectors. Please note that I am saying column vectors and not column-major element ordering. It does also use column-major ordering, but let's table that for the moment.
When SceneKit was bound 8 years ago,
SCNMatrix4
should have been implemented as a Column-vector Column-major matrix. Sadly, it was implemented as a Row-vector Row-major matrix.A Modern Tragedy
This mistake was caught early on, but it wasn't understood that the mistake was two-fold.
It was noticed that if you add a
Transpose
to the matrix before it is passed to SceneKit (SCNNode.Transform
for example) then the matrix acted the same way as Apple's matrices (that are column-vector, column-major).Sadly, the conclusion was that the mistake was purely due to the column-major, row-major mixup.
It wasn't recognized that the matrices were actually constructed incorrectly. Because the matrices have a Row-vector interface for construction (for example,
SCNMatrix4.CreateTranslation
, etc.) the order in which they were composed was backwards (compared to the Apple examples).For .NET 6, the matrices were correctly switched to be Column-major element ordering. https://github.com/xamarin/xamarin-macios/pull/13695
However the APIs were not fixed to make the matrices Column-vector transformers (they remain row-vector transformers). This puts them in a terribly awkward state where they act like row-vector transformers when used by themselves and with vectors. But when used with SceneKit that is expecting Column-vector, Column-major element ordering, they don't work at all.
To demonstrate this inconsistency I have written some unit tests. Allow me to explain them next.
Unit Tests to Demonstrate Inconsistencies
Demonstrate that SCNMatrix4 is Row-vector, Column-major
I wish to first demonstrate that the matrix class itself isn't broken, but is indeed constructed in a Row-vector manner. Let's first demonstrate that simple translation works fine:
That works. To demonstrate that the matrix assumes left-hand vector multiplication (which means the vectors have to be row vectors), let's compose two transformations, one after the other. We will do a translation followed by a scaling operation. This is a a good test because the order of these operations matters a lot and is very evident.
Note that the translation operation had to come before (on the left) the scale operation. This means it runs first, then the scale. This is proven out in the test because the X coordinate of the point is first translated then scaled:
X = (1 + T)*S
X = (1 + (-1))*10
X = 0*10
X = 0
If you instead put the scale on the left of the translation, then you note that the result is quite different.
X = 1*S + T
X = 1*10 + (-1)
X = 10 - 1
X = 9
We can see from this code and its results that the matrices are indeed constructed assuming row vectors. This is independent of the memory order of elements.
The memory order of the matrices doesn't really make a difference here because the algorithms correctly address the fields based on whether they're rows or columns.
The problem is: this hack doesn't work with SCNNode, because it's incomplete.
Demonstrate that the matrix does not work with
SCNNode
The .NET 6 matrix hack does not fix how matrices are used with
SCNNodes
. This is because, although effort was put into them being Column-major, no effort was put into making them Column-vector constructed. This means the matrices are still the transpose of what they should be. One problem was fixed, while the other, much bigger problem, was ignored.To demonstrate this confusing state of affairs. Let us now attempt to use one of the .NET 6 matrices with an
SCNNode
. We will do the simplest thing possible, translate the node using a translation matrix.This code fails even though we just showed that translation matrices work.
The problem is this: the matrix is constructed as a row-vector transformer. It is encoded column-major. Because it's column-major the Microsoft binding passes it directly to SceneKit. BUT SceneKit wants a column-vector transformer. The result is, the transform is not applied correctly (the translation is actually stored in the homogenous skew coordinates, not a good place).
This means that you still have to do a
Transpose
operation when using these matrices onSCNNodes
BUT you must use the normal (not transposed version) when transforming points.This is terribly inconsistent and makes writing 3D math impossible.
Proposed Solutions
The Easy Solution
Just go back to row-vector, row-major matrices. I know they're wrong. But they're wrong in an internally consistent way. They work the same way with point transforms as they do with
SCNNodes
and friends.Then there would be no broken breaking change and we can all get on with our lives.
The Correct Solution
SCNMatrix4
should be designed and implemented as a Column-vector, Column-major matrix (or a new struct should be introduced). It would match perfectly with SceneKit, OpenGL, mathematics, engineering, and basically everything else in the universe. It would be both internally consistent and work with SceneKit without any transpositions.I wish this was the breaking change introduced in .NET 6. Instead, the change we got is only a partial solution.
Workarounds
If you are coming into this bug, completely confused by .NET 6's matrices, let me give you this advice about how to work around its new design:
Conclusion
Thank you for coming to my TED talk. I know that everything I wrote can sound pedantic, but I hope I demonstrated through the tests and examples that the new (.NET 6)
SCNMatrix4
operates in a very inconsistent manner. This is due to not fully understanding why the original binding in SceneKit was incorrect and only doing a partial fix (fixed the element ordering but neglected to fix the construction and transformation methods).Attached Tests and Repro
SCNMatricNET6Boug.zip