Happy to answer.
So this is a property of computer graphics. In 2D, coordinates are fairly standard: (0,0) is at the bottom left, positive X goes right, and positive Y goes up. In 3D it's not as standardized; there are two popular coordinate systems: left-handed and right-handed.
https://www.evl.uic.edu/ralph/508S98/coordinates.html
That page describes it pretty well, and you can also search for other resources. Unity, OpenGL, Unreal, DirectX, etc. all use one or the other, so there isn't a single standard. For Unity we need to convert to left-handed.
You ask about M33 and other things, so this gets a bit more involved and I won't go through all of the math here; there are plenty of resources, but we're essentially just changing the axes from one convention to the other. You don't need to change all the values of the matrix to do so; it's all done by swapping rows and negation. You can find some great tutorials on this online.
Hi @plantoscloud. The SU SDK internally works using a right-handed coordinate system, while Unity uses a left-handed coordinate system. To transfer spatial information from our SDK into Unity, we must convert from one coordinate system to the other so that our objects end up in the correct location.
Right-handed vs. left-handed is nothing but a decision you make when working in 3D space. There isn't a standard for this; everyone can do it differently.
For example, when you draw a coordinate system on a piece of paper, you internally decide which direction the X, Y and Z axes go relative to the paper. We usually draw the X axis as an arrow pointing to the right side of the paper, and the Y axis usually goes up toward the top of the paper. However, people don't always agree on which direction the Z axis should go: some draw the Z axis as if it went into the paper, others draw the arrow as if it came out of the paper. Note that in some applications the Z axis is the one pointing up instead of Y, and the Y axis is the one going in and out of the paper; many applications use different coordinate systems. For example, Blender uses Z for up/down and Y for in/out.
In reality any axis can point in any direction; there's no strict requirement that the X, Y and Z axes point in any specific direction. But the two scenarios I mentioned, the Z axis going into the paper and the Z axis coming out of the paper, are the two most common choices, and they correspond to the left-handed and right-handed coordinate systems.
The only thing that function is doing is taking a position in space represented in a right-handed system (Z coming out of the monitor/screen) and converting it into a left-handed position in space (Z going into the screen/monitor).
For example, let's imagine we have a position in space of {0, 0, 1}, meaning X = 0, Y = 0, Z = 1 in a right-handed system. If we were to use these coordinates to place a Unity GameObject, we would put the object in the wrong location along the Z axis: since Unity is left-handed, its Z axis goes into the monitor, so a positive value goes into the monitor, whereas in a right-handed system a positive value comes out of the monitor.
The only thing we need to do is change the sign of the Z value, from positive to negative or vice versa. That way our right-handed coordinates will place the object in the right place in a left-handed space.
The only difference is that in this function the X, Y and Z values are represented by a 4x4 transformation matrix instead of a vector (x, y, z). A transformation matrix can store a translation, a rotation and a scale; if you negate the correct sub-indices of the matrix, you negate the Z value.
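To make that concrete, here is a minimal sketch of the idea rather than the SDK's exact code. It assumes System.Numerics.Matrix4x4 (the M13/M33-style field names you see in the function); the class and method names are made up for illustration:

```csharp
using NumericsMatrix = System.Numerics.Matrix4x4;
using NumericsVector = System.Numerics.Vector3;

public static class HandednessSketch
{
    // For a plain position, flipping handedness is just negating Z:
    // {0, 0, 1} in the right-handed system becomes {0, 0, -1} in Unity's left-handed system.
    public static NumericsVector ToLeftHanded(NumericsVector rightHanded)
    {
        return new NumericsVector(rightHanded.X, rightHanded.Y, -rightHanded.Z);
    }

    // For a full transform, the same Z flip means negating every matrix entry that touches
    // the Z axis exactly once, i.e. M' = S * M * S with S = diag(1, 1, -1, 1).
    public static NumericsMatrix ToLeftHanded(NumericsMatrix m)
    {
        m.M13 = -m.M13; m.M23 = -m.M23; m.M43 = -m.M43; // Z column (rotation part + Z translation)
        m.M31 = -m.M31; m.M32 = -m.M32; m.M34 = -m.M34; // Z row
        // M33 is skipped on purpose: it would be negated once for its row and once for its
        // column, so the two sign flips cancel and its value stays the same. M44 stays 1.
        return m;
    }
}
```

The matrix version is doing exactly the same thing as the vector version, just for a whole transform (position, rotation and scale) at once.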
I have read that I don't need to worry about this, that it's supposed to be handled by the SDK, yet my use case is to give a robot directions over a turn-based grid. You know those game grids from the old D&D games or many strategy games. So I want to overlay a semi-visible grid on reality, then have the human create a path over the grid that gets translated into a directive for the robot, and I feel like I need to know these little nuances.
But even though I have a better understanding from your answer, I still feel confused.
http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/
So I tried some experiments. I created a parent and a child and put a script on the child:
```csharp
// the parent sitting at 10,10,10
UnityEngine.Matrix4x4 parentmatrix = transform.parent.localToWorldMatrix;

// the child at local 2,2,5
var localposition = transform.localPosition;
Debug.Log($"localposition: ${localposition}");

var newposition = parentmatrix * localposition;
Debug.Log($"newposition: ${newposition}");
```

Output:

```
m00: 1, m10: 0, m20: 0, m30: 0
m01: 0, m11: 1, m21: 0, m31: 0
m02: 0, m12: 0, m22: 1, m32: 0
m03: 10, m13: 10, m23: 10, m33: 1
localposition: $(2.0, 2.0, 5.0)
newposition: $(2.0, 2.0, 5.0, 0.0)
```
Why do you even need a matrix? If I have a position vector (2, 2, 5) and my parent is at (10, 10, 10), why do you need a matrix just to add the components of the two vectors?
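(Side note on the experiment above: Unity's `Matrix4x4 *` operator takes a `Vector4`, and a `Vector3` converts to it implicitly with w = 0, i.e. as a direction, which is why the parent's translation didn't show up in `newposition`. Treated as a point with w = 1, or via `MultiplyPoint3x4`, the translation is applied. A quick sketch, assuming the same parent at (10,10,10) and child at local (2,2,5):

```csharp
using UnityEngine;

public class MatrixExperiment : MonoBehaviour
{
    void Start()
    {
        Matrix4x4 parentMatrix = transform.parent.localToWorldMatrix;
        Vector3 localPosition = transform.localPosition;

        // Vector3 implicitly becomes a Vector4 with w = 0 (a direction),
        // so the translation part of the matrix has no effect.
        Vector4 asDirection = parentMatrix * (Vector4)localPosition;       // (2, 2, 5, 0)

        // With w = 1 the point is translated as well.
        Vector4 asPoint = parentMatrix * new Vector4(
            localPosition.x, localPosition.y, localPosition.z, 1f);        // (12, 12, 15, 1)

        // Or use the helper that treats the input as a point directly.
        Vector3 worldPoint = parentMatrix.MultiplyPoint3x4(localPosition); // (12, 12, 15)

        Debug.Log($"asDirection: {asDirection}, asPoint: {asPoint}, worldPoint: {worldPoint}");
    }
}
```

As for why a matrix at all: once the parent also has rotation or scale, transforming the child is no longer a simple component-wise addition, and the matrix captures translation, rotation and scale in one multiplication.)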
Hi @plantoscloud, what the SDK and the SceneUnderstandingManager component in Unity are doing is translating all the information from your HoloLens device into something Unity can understand. Once your real world is represented in Unity, you can use all the power of the Unity engine to create your grid on top of the SU GameObjects.
In other words, you don't have to deal with any transformation matrix; the SDK is already doing the heavy lifting there.
Why don't you create your own component called 'GridManager', 'GridCreator', 'GridFactory', etc., and have that component gridify all the SU objects that you consider walkable? Anyway, this is just one approach of many; you decide how you want to implement it.
You can build the grid on top of what the SceneUnderstandingManager is giving you; you don't have to modify the SceneUnderstandingManager itself, meaning you don't have to deal with transformation matrices.
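Purely as an illustration of that idea, here is a skeleton of what such a component might look like. Everything in it (the GridManager name, the cellSize field, the name-based "Floor" check) is made up for the sketch and would need to be adapted to however you identify walkable SU objects:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component: walks the GameObjects created from Scene Understanding
// and lays a simple square grid of markers over the ones considered walkable.
public class GridManager : MonoBehaviour
{
    public Transform sceneRoot;        // parent that holds the SU GameObjects
    public GameObject cellPrefab;      // a flat quad (or similar) used as one grid cell
    public float cellSize = 0.25f;     // grid resolution in meters

    public List<GameObject> BuildGrid()
    {
        var cells = new List<GameObject>();

        foreach (var meshRenderer in sceneRoot.GetComponentsInChildren<MeshRenderer>())
        {
            // Naive "walkable" check for the sketch: assumes floor objects have
            // "Floor" in their name; replace with your own criteria.
            if (!meshRenderer.gameObject.name.Contains("Floor"))
                continue;

            Bounds b = meshRenderer.bounds;
            for (float x = b.min.x; x < b.max.x; x += cellSize)
            {
                for (float z = b.min.z; z < b.max.z; z += cellSize)
                {
                    var cell = Instantiate(cellPrefab, new Vector3(x, b.max.y, z),
                                           Quaternion.identity, transform);
                    cells.Add(cell);
                }
            }
        }

        return cells;
    }
}
```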
I see. I guess I could just use SU in a limited fashion, just to identify where the floor is, so that I can place a large, flat plane carved up into a hexagonal or square grid. I can then light up the portions of the grid corresponding to the path the human is looking or pointing at, and then start a firmly lit path through that grid once the user does another hand motion.
When the robot wakes up, it can download the room and run a copy of the game with the deserialized room.
I hope the robot's camera gets placed in the room at the correct coordinates just by identifying the room.
Also, decide whether you want to gridify the environment because that's the way you want your application to look (similar to a tabletop game), or whether you only want to gridify to aid path finding. If it's the latter, I recommend using NavMesh instead; we have a demo of using NavMesh and Unity together in the Sample.
Scene Understanding MRTK has a demo of NavMesh and Unity?
Nah, I don't just want the look. Actually, imagine there's no look at all: just a 1x1 square lights up at the XYZ you are pointing to, and the path you chose in the grid also lights up so you can press save.
Later I need the robot (which also has a HoloLens) to turn on the app, find itself in the room correctly, get to the start of the path, and follow the path the human laid out to reach the point where it needs to do the work at that XYZ.
Yes, if you go to this location in this repository you can see our demo of NavMesh.
The voice command to move the agent is 'Go there' if you want to try it on HoloLens; in the Editor the key is 'G'.
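For reference, the heart of that kind of interaction is just Unity's built-in navigation API. A minimal sketch, assuming a NavMesh has already been built over the SU geometry; the gaze raycast and the 'G' key handling below are illustrative, not the sample's actual code:

```csharp
using UnityEngine;
using UnityEngine.AI;

// Minimal illustration: send a NavMeshAgent to wherever the camera is looking
// when the user presses 'G'.
public class GoThereExample : MonoBehaviour
{
    public NavMeshAgent agent;

    void Update()
    {
        if (!Input.GetKeyDown(KeyCode.G))
            return;

        var gaze = new Ray(Camera.main.transform.position, Camera.main.transform.forward);
        if (Physics.Raycast(gaze, out RaycastHit hit, 20f))
        {
            // Snap the hit point onto the NavMesh and walk there.
            if (NavMesh.SamplePosition(hit.point, out NavMeshHit navHit, 1f, NavMesh.AllAreas))
            {
                agent.SetDestination(navHit.position);
            }
        }
    }
}
```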
That being said, SU and MRTK are two different things. This repository lets you use SU without MRTK. If you want to use MRTK and SU together, I suggest using the experimental SU observer added in MRTK version 2.6; we worked with the MRTK folks to ship SU as a feature in MRTK 2.6.
Thanks. So I've just cut and pasted, line by line, following the logic of the SceneUnderstandingManager code, from Unity 2019.4 into Unity 2020.3, where I have a hello-world project with MRTK (Mixed Reality Toolkit) in it. One line that didn't compile was `unityParentHolderObject.AddComponent<UnityEngine.XR.WSA.WorldAnchor>();` on line 603, so I commented it out. I think the HoloLens team is moving away from the older XR APIs to OpenXR or something, I don't know; OpenXR is supposed to be the new way from 2020 onwards, it seems.
So now I have a good grasp of what you are trying to accomplish with the SDK, at least logically.
Mathematically speaking, there are some things I need to come back to.
I feel I really need to understand this because ultimately I will also be interfacing, in some simple ways, with other devices on the robot performing SLAM... and then some of these video streams from the HoloLens and from the depth camera I may have to send to a vision pipeline like NVIDIA's DeepStream or Microsoft Azure Cognitive Services.
All of these provide X, Y, Z information about things in the space which I need to line up.
If you guys had a simple-to-advanced tutorial that would hand-hold me through the basic APIs and the mathematics I need to understand in order to line up all these elements, that would be awesome.
I tried to find information about Windows Perception, and the only thing you have is a QR code sample project, so that assumes I'm going to know what the heck the code is actually doing under the covers.
If you guys want to somehow sponsor my work, or give me a job working on independent projects for Microsoft, I'm all ears. Free Azure for a year? Even give me a team that already gets some of the stuff I don't, like a game engineer, a computer vision engineer, even someone who can help design the arm and the drive system. I'd be eternally grateful :)
Continued... the thing is this:
There are robots out there from Boston Dynamics, like Stretch: https://www.youtube.com/watch?v=yYUuWWnfRsk
This is cool and everything, but mainstream adoption of robotics is not cost effective. You cannot buy a robot from the store and get it to work in your backyard unless you're willing to pay the company's engineers 200 dollars an hour for a week to set it up for you, and that's because everyone's solution is to lean too heavily on AI.
I think the most cost-effective solution is to use HoloLens to draw up work in 3D and then have the robot itself follow that work through the game camera, while using the advanced AI for tactical things like object identification or inverse kinematics.
But when you hire a worker at McDonald's, the first thing you do is show them where the cash register is, where the mops are and where the frozen patties are kept. And this is different for every McDonald's even though it's the same work.
The feedback is awesome; we are discussing more step-by-step tutorials because this will be super important.
With regards to robotics and simulation, agreed, and stay tuned... we're planning some things in this space.
Please add me to any early access programs. Thanks.
I don't understand what this is for and how it works.
Do I need to study video game math? Geometry?
I'm new to Unity development. Are they doing this because a right-handed coordinate system is used inside the circuits to render, but Unity, the software we use to build apps, uses a left-handed coordinate system? Is that the purpose of this code? And why do they skip matrix.M33? e.g. they didn't do matrix.M33 = -matrix.M33.