Closed andybak closed 3 years ago
Follow up question - what are the .conf files for? Are there some docs on this I've overlooked?
Hello Andy,
yes, you are right. See this simple example of how to load a .depth file:
import numpy as np
import cv2
import liblzfse  # https://pypi.org/project/pyliblzfse/


def load_depth(filepath):
    # Read the LZFSE-compressed file and decompress it into raw bytes
    with open(filepath, 'rb') as depth_fh:
        raw_bytes = depth_fh.read()
        decompressed_bytes = liblzfse.decompress(raw_bytes)
        depth_img = np.frombuffer(decompressed_bytes, dtype=np.float32)

    depth_img = depth_img.reshape((640, 480))  # For a FaceID camera 3D Video
    # depth_img = depth_img.reshape((256, 192))  # For a LiDAR 3D Video

    return depth_img


if __name__ == '__main__':
    depth_filepath = '/tmp/depth_0.lzfse'
    depth_img = load_depth(depth_filepath)

    cv2.imshow('Depth', depth_img)
    cv2.waitKey(0)
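A note on displaying the result: `cv2.imshow` maps float images from the range [0, 1] to [0, 255], so any depth beyond 1 m shows up as pure white. The sketch below rescales the depth map to 8-bit before display. The helper name `depth_to_uint8` and the 3 m clipping distance are illustrative assumptions, not part of the app or the library.

```python
import numpy as np


def depth_to_uint8(depth_img, max_depth_m=3.0):
    """Scale a depth map in meters to an 8-bit grayscale image.

    max_depth_m is a hypothetical clipping distance chosen for illustration.
    """
    # Replace NaN/inf so the cast to uint8 below is well defined
    depth = np.nan_to_num(depth_img, nan=0.0, posinf=max_depth_m, neginf=0.0)
    # Clip to [0, max_depth_m], then map that range linearly to [0, 255]
    clipped = np.clip(depth, 0.0, max_depth_m)
    return (clipped / max_depth_m * 255.0).astype(np.uint8)
```

The returned array can be passed to `cv2.imshow('Depth', ...)` in place of the raw float image.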
As you wrote, the decompressed .depth file is just a buffer of raw float32 depth bytes (each float32 value is a depth value in meters). There are 49 152 (i.e. 192×256) values for a LiDAR frame and 307 200 (i.e. 480×640) values for a FaceID frame.
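Since the two frame types differ only in the number of values, the shape can be picked from the buffer length instead of being hard-coded. This is a minimal sketch under that assumption; `KNOWN_SHAPES` and `depth_from_bytes` are hypothetical names, and only the two sizes stated above are covered.

```python
import numpy as np

# (rows, cols) for the two frame sizes mentioned in this thread
KNOWN_SHAPES = {
    192 * 256: (256, 192),  # LiDAR frame, 49 152 values
    480 * 640: (640, 480),  # FaceID frame, 307 200 values
}


def depth_from_bytes(decompressed_bytes):
    """Interpret decompressed .depth bytes as a 2D float32 depth map in meters."""
    values = np.frombuffer(decompressed_bytes, dtype=np.float32)
    shape = KNOWN_SHAPES.get(values.size)
    if shape is None:
        raise ValueError(f"Unexpected number of depth values: {values.size}")
    return values.reshape(shape)
```

Any other buffer length raises a `ValueError` rather than guessing a resolution.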
The .conf files contain a confidence map for each frame. It has the same size as the depth map, and for each pixel of the depth map it contains a uint8 number in the range 0-2, which indicates the confidence that the sensed LiDAR depth is "correct". In other words, it is a measure of depth data quality.
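One way to use the confidence map is to blank out depth pixels below a chosen confidence. The sketch below assumes the .conf bytes have already been decompressed (the thread does not state how .conf files are stored, so decompressing them the same way as .depth files is an assumption). `mask_low_confidence` and `min_conf` are hypothetical names.

```python
import numpy as np


def mask_low_confidence(depth_img, conf_bytes, min_conf=2):
    """Return a copy of depth_img with low-confidence pixels set to NaN.

    conf_bytes: decompressed confidence buffer, one uint8 (0-2) per depth pixel.
    """
    conf = np.frombuffer(conf_bytes, dtype=np.uint8).reshape(depth_img.shape)
    masked = depth_img.astype(np.float32, copy=True)
    masked[conf < min_conf] = np.nan  # drop pixels below the threshold
    return masked
```

With `min_conf=2` only the highest-confidence pixels survive; `min_conf=1` keeps everything except the lowest level.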
I think this answers your question, so I am closing this issue, but feel free to ask follow-up questions.
(thanks! the above was really helpful for me. However - will anyone else find it easily as it is in a closed github issue? Part of my reason for opening this was to suggest that something like the above would be a great addition to the docs)
You are right, thanks for reminding me of that. I added a link to this Issue to the Wiki.
Had the same confusion and found this issue before going to the Wiki. It would be very helpful to add some mention of it into the main Readme
Also on a related note - is it possible to get distance in meters from an exported RGBD video?
OK, I will mention the Wiki in the Readme the next time I push an update.
As for getting the distance in meters from exported RGBD videos: yes, it is possible. I described how to do it in the Readme of this demo.
got it, thank you for a great app and library! please do add landscape mode for the iPad someday though
Thank you for the suggestion, noted! I will include landscape mode in a future update.
Two follow-up questions:
To answer your questions:
.r3d file, you will see a metadata file. This is the JSON config file.
2 is high confidence, 1 is "lower" confidence and 0 is the lowest confidence.
Hi, I wonder if you've updated both LiDAR & FaceID depth resolution?
depth_img = depth_img.reshape((1280, 960)) # For a FaceID camera 3D Video
depth_img = depth_img.reshape((512, 384)) # For a LiDAR 3D Video
Is the above correct? Another three questions:
Thank you very much!
An addition to this issue regarding the Apple ARKit depth confidence map:
From https://developer.apple.com/documentation/arkit/arconfidencelevel, there are only three levels of confidence map values.
case low: Depth-value accuracy in which the framework is less confident.
case medium: Depth-value accuracy in which the framework is moderately confident.
case high: Depth-value accuracy in which the framework is fairly confident.
Hope this helps future users understand why the confidence map consists only of 0, 1 and 2!
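To make the 0/1/2 values self-describing in code, they can be wrapped in an enum that mirrors the three ARKit levels. This is only a sketch: the `Confidence` class and `confidence_histogram` are hypothetical names, and the input is assumed to be the already-decompressed confidence buffer.

```python
from collections import Counter
from enum import IntEnum


class Confidence(IntEnum):
    """Confidence levels as described in this thread (0 = lowest, 2 = highest)."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def confidence_histogram(conf_bytes):
    """Count how many pixels fall into each confidence level."""
    counts = Counter(conf_bytes)  # iterating over bytes yields ints
    return {level.name: counts.get(level.value, 0) for level in Confidence}
```

A quick look at the histogram shows how much of a frame is high confidence before deciding on a threshold.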
I've exported a recording in the native r3d format and I'm attempting to read the depth data:
>>> pth = 'winhome/Documents/3D Scans/2020-10-28--15-01-03/rgbd/1.depth'
>>> fh = open(pth, "rb")
>>> compressed = fh.read()
>>> decompressed = liblzfse.decompress(compressed)
But then I'm not sure what to do with the decompressed data. Is it just a case of reading each 4 bytes, unpacking them to a single precision float? The jpgs are 192x256 and doing the maths on that seems to add up: 192 x 256 x 4 = 196608 and len(decompressed) gives me 196608.
So this looks right:
>>> import struct
>>> f = [struct.unpack('f', decompressed[x:x+4])[0] for x in range(0, len(decompressed), 4)]
Then I guess I can just write f into any image format that supports floating point (.hdr or .exr maybe)
Am I on the right lines? Are the values linear distances from the camera?
If so - it would be nice to add this to the docs.
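For reference, the same unpacking can be done in one `struct` call with only the standard library, including the size check from the arithmetic above (192 x 256 x 4 = 196608 bytes). This sketch assumes little-endian float32 values stored row by row; `unpack_float32` is a hypothetical helper name.

```python
import struct


def unpack_float32(decompressed, width=192, height=256):
    """Unpack raw float32 bytes into a list of rows (height x width)."""
    expected = width * height * 4  # 4 bytes per single-precision float
    if len(decompressed) != expected:
        raise ValueError(f"Expected {expected} bytes, got {len(decompressed)}")
    # '<' = little-endian (assumed), 'f' = 4-byte float
    flat = struct.unpack(f"<{width * height}f", decompressed)
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]
```

For full-size frames a NumPy array is usually more convenient than nested lists, but this version shows that the file needs no special parsing beyond the decompression step.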