Unity-Technologies / arfoundation-samples

Example content for Unity projects based on AR Foundation

How to get AR camera/preview view coordinates from CpuImage coordinates? #972

Closed manwithsteelnerves closed 2 years ago

manwithsteelnerves commented 2 years ago

How do I... CpuImage has a totally different coordinate system from the camera view under AR Session Origin. Are there any available methods to do this conversion?

I'm currently detecting bounds for an object by passing the CpuImage to my ML engine. The detected bounds come back in CpuImage coordinates, which are very different from the camera preview coordinates since the CpuImage has a different resolution.

I'm looking for a way to convert a point in the CpuImage to a point in the camera preview. Are there any built-in API methods to do this?

DavidMohrhardt commented 2 years ago

There is no built-in API for converting CpuImage coordinates to the coordinates you'd see on your screen. However, you can look at the ARCore/ARKit background shader to see how the image is mapped to the phone screen. You'd need to apply the same texture transforms to your bounds to compute their on-screen coordinates.
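To make the texture-transform idea above concrete, here is a minimal sketch in plain Python: normalize the CPU-image pixel to UV, push it through a 3x3 affine display-style transform, then scale into screen pixels. The matrix values below are made up for illustration; in a real app you'd use the `displayMatrix` from `ARCameraFrameEventArgs` (which also encodes rotation and aspect-ratio cropping).

```python
def cpu_to_screen(px, py, img_w, img_h, screen_w, screen_h, m):
    """Map a CPU-image pixel to a screen pixel via an affine UV transform.

    m is a row-major 3x3 matrix acting on homogeneous UV coordinates.
    This is a sketch of the approach, not ARFoundation's actual math.
    """
    # Normalize the CPU-image pixel to UV in [0, 1].
    u, v = px / img_w, py / img_h
    # Apply the display-style transform (rotation / flip / crop).
    su = m[0][0] * u + m[0][1] * v + m[0][2]
    sv = m[1][0] * u + m[1][1] * v + m[1][2]
    # Scale the transformed UV into screen pixels.
    return su * screen_w, sv * screen_h

# Identity transform: the image center maps to the screen center.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(cpu_to_screen(320, 240, 640, 480, 1080, 1920, identity))  # → (540.0, 960.0)

# A 90-degree rotation with a flip, roughly what a portrait phone
# needs for a landscape sensor image (illustrative values only).
rot = [[0, -1, 1], [1, 0, 0], [0, 0, 1]]
```

The same transform applies to each corner of a bounding box; transform all four corners and re-take min/max, since rotation can swap axes.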

An alternative approach would be to blit the background using the current background material, issue an AsyncGPUReadback request, and then use the result in your ML engine. This may be a little easier overall.

manwithsteelnerves commented 2 years ago

Thanks for the details. I'll check the shader code. I'm also wondering whether the matrices passed in the frameReceived event args can help transform the point from the CpuImage. Where can I find more information about the display matrix? It would be great if someone here could explain it.

The only problem I see with the blitting approach is the final image format. I need YUV_420_888 (or at least NV21), which CpuImage provides by default, but AsyncGPUReadback doesn't let me specify that format. If CpuImage did the conversion internally I could avoid converting on my end, but most likely it doesn't, since CPU images seem to come from the ARCore native C API, which provides data directly in YUV_420_888.

manwithsteelnerves commented 2 years ago

@DavidMohrhardt @tdmowrer Any thoughts?

tdmowrer commented 2 years ago

> I'm also wondering whether the matrices passed in the frameReceived event args can help transform the point from the CpuImage. Where can I find more information about the display matrix? It would be great if someone here could explain it.

See https://github.com/Unity-Technologies/arfoundation-samples/blob/d71529a1767efbaeb78732f4b23046e6e19c2717/Assets/Scripts/DisplayDepthImage.cs#L328-L370 for an example.

> The only problem I see with the blitting approach is the final image format. I need YUV_420_888 (or at least NV21), which CpuImage provides by default, but AsyncGPUReadback doesn't let me specify that format. If CpuImage did the conversion internally I could avoid converting on my end, but most likely it doesn't, since CPU images seem to come from the ARCore native C API, which provides data directly in YUV_420_888.

On Android, the XRCpuImage camera image should already be in YUV_420_888 (depth images use different formats). See XRCpuImage.Format. That's also what the ARCore C API provides.

manwithsteelnerves commented 2 years ago

It looks like I need to explore the display matrix further. I tried using it, but unfortunately it didn't work the way I expected; most likely I'm doing something wrong.