Open christuchez opened 10 months ago
We also asked this question in #10, but have not found an answer yet.
What I did was output the values to a binary file. You can then read the file to find the values.
`
import os
import torch

# Estimate depth directly from PIL image running on GPU
depth_data = model.infer_pil(image, output_type="tensor")
# Move to CPU and convert to float32
depth_data_cpu = depth_data.cpu().type(torch.float32)
# Convert to numpy array
depth_data_numpy = depth_data_cpu.numpy()
# Combine all rows into one 1D array
depth_data_flat = depth_data_numpy.flatten()
# Output binary file path
output_path = os.path.join(image_directory, "depth.bin")
# Write depth data to binary file (raw float32 values, no header)
with open(output_path, 'wb') as file:
    file.write(depth_data_flat.tobytes())
`
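To read the values back, the file can be reinterpreted as float32 and reshaped to the size of the depth map. This is a minimal sketch, assuming the file was written as above (raw float32, row-major, no header). The file does not store the shape, so the height and width have to be supplied by the caller, and `read_depth_bin` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def read_depth_bin(path, height, width):
    """Load a raw float32 depth dump and restore its 2D shape."""
    flat = np.fromfile(path, dtype=np.float32)
    # The file has no header, so check the size against the expected shape
    if flat.size != height * width:
        raise ValueError(f"expected {height * width} values, found {flat.size}")
    return flat.reshape(height, width)
```

With `depth = read_depth_bin(path, height, width)`, the depth at pixel column `u`, row `v` is `depth[v, u]`.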
That gives only the depth. I think what he wants is the 3D coordinate corresponding to each pixel, which is also what I am looking for. Is there a solution for that?
For the x and y coordinates you just need the pixel location and a projection factor for each axis.
Unfortunately, the projection factors are specific to the image and camera used to take it, so if you don't know them, you'll need to tweak until they look right.
z = value you read
x = z * projectionFactor.x * (pixel.x - center.x) / width
y = z * projectionFactor.y * (pixel.y - center.y) / height
For example: the image is 192x384, so the center is (96, 192), and projectionFactor = (1.1, 1.2).
If you read pixel (12, 23) with z = 5.1 m:
x = 5.1 * 1.1 * (12 - 96) / 192 = -2.45
y = 5.1 * 1.2 * (23 - 192) / 384 = -2.69
The positions are in camera space.
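The same calculation as a small function, for anyone who wants to check their numbers. This is only a sketch of the formula from this comment: `pixel_to_camera` is a hypothetical name, and the image center is assumed to be exactly half the width and height.

```python
def pixel_to_camera(px, py, z, width, height, projection_factor):
    """Back-project one pixel to camera space with the formula from this thread."""
    # Assumption: the optical center is the middle of the image
    cx, cy = width / 2, height / 2
    pf_x, pf_y = projection_factor
    x = z * pf_x * (px - cx) / width
    y = z * pf_y * (py - cy) / height
    return x, y, z

# Worked example from above: 192x384 image, pixel (12, 23), z = 5.1 m
x, y, z = pixel_to_camera(12, 23, 5.1, width=192, height=384,
                          projection_factor=(1.1, 1.2))
print(round(x, 2), round(y, 2), z)  # -2.45 -2.69 5.1
```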
How do I find projectionFactor?
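One possible way, if the camera's field of view or focal length is known: under an ideal pinhole model (no lens distortion, optical center in the middle of the image), x = z * (pixel.x - center.x) / fx, where fx is the focal length in pixels. Compared with the formula in this thread, that would make projectionFactor.x = width / fx, which equals 2 * tan(horizontal FOV / 2), and likewise for y with the vertical FOV. The sketch below rests on those assumptions. The function names are hypothetical, and the FOV or focal length has to come from the camera specification or a calibration.

```python
import math

def projection_factor_from_fov(hfov_deg, vfov_deg):
    """Projection factors from horizontal/vertical field of view in degrees."""
    # Assumes an ideal pinhole camera: factor = 2 * tan(fov / 2)
    return (2 * math.tan(math.radians(hfov_deg) / 2),
            2 * math.tan(math.radians(vfov_deg) / 2))

def projection_factor_from_focal(fx_pixels, fy_pixels, width, height):
    """Projection factors from focal lengths expressed in pixels."""
    # Assumes the same pinhole model: factor = image size / focal length
    return (width / fx_pixels, height / fy_pixels)
```

If neither value is available, tweaking the factors until the result looks right, as suggested above, is still the fallback.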
If I take an image, generate the depth map, and then generate 3D points, how can I map a specific 2D pixel to a 3D value? For example, pixel (34, 56) in my original image is still (34, 56) in the depth map, so I can get the depth at that pixel, but how can I get the corresponding values from the 3D mesh?
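One way to keep that mapping is to build the 3D points as an array with the same layout as the depth map, so the pixel index is also the point index. This is a sketch under several assumptions: the depth map has the same resolution as the original image, the projection factors are known, and the formula from earlier in this thread applies. `depth_to_points` is a hypothetical name.

```python
import numpy as np

def depth_to_points(depth, projection_factor):
    """Return an (H, W, 3) array where points[row, col] is that pixel's camera-space XYZ."""
    height, width = depth.shape
    pf_x, pf_y = projection_factor
    # cols[r, c] == c and rows[r, c] == r, both shaped (height, width)
    cols, rows = np.meshgrid(np.arange(width), np.arange(height))
    x = depth * pf_x * (cols - width / 2) / width
    y = depth * pf_y * (rows - height / 2) / height
    return np.stack([x, y, depth], axis=-1)
```

For pixel (34, 56), meaning column 34 and row 56, the 3D value would then be `points[56, 34]`, since NumPy indexes the row first. If the points are later flattened into a mesh or point cloud with `points.reshape(-1, 3)`, the same pixel sits at index `56 * width + 34`.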