Open marcelbrucker opened 2 years ago
Maybe related to #4116
We will look into the issue with Vector3dVector. Meanwhile, you might be interested to know that Open3D supports custom attributes such as intensity in the new tensor-based module: open3d.t.io.read_point_cloud reads the custom attributes, so you may not need pandas.
Thank you!
I tried out open3d.t.geometry.PointCloud a couple of weeks ago, but it could not replace open3d.geometry.PointCloud entirely, as plotting in the standard visualizer, plane fitting, and some other features were not yet available in release 0.15.2.
The main reason is that data_pd is not C-contiguous. C-contiguous (row-major) is the default layout for NumPy arrays, but transposing such an array gives an F-contiguous view. Open3D uses C-contiguous arrays internally for Eigen. When an array is not C-contiguous, Vector3dVector does the conversion row by row, which is slow.
ipdb> data_pd.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
ipdb> data_np.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
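The two layouts in the flags above can be reproduced with plain NumPy, without pandas or Open3D. This is only an illustrative sketch: the array names are made up, and a transposed view stands in for data_pd.

```python
import numpy as np

# Hypothetical stand-in for data_pd: a (3, N) row-major buffer viewed as (N, 3).
columns = np.arange(12, dtype=np.float64).reshape(3, 4)
points_f = columns.T  # transposed view: F-contiguous, does not own its data

# np.ascontiguousarray copies the data into row-major (C) order.
points_c = np.ascontiguousarray(points_f)

for name, arr in (("points_f", points_f), ("points_c", points_c)):
    print(
        name,
        "C:", arr.flags["C_CONTIGUOUS"],
        "F:", arr.flags["F_CONTIGUOUS"],
        "owns data:", arr.flags["OWNDATA"],
    )
```

The values are identical in both arrays; only the memory order differs, and that order is what decides which conversion path Vector3dVector takes.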
The quick fix is to convert the array to a C-contiguous array before calling Vector3dVector. The conversion still has overhead, but it is much faster than before.
t2 = timer()
data_pd = np.ascontiguousarray(data_pd)
pc_points1 = o3d.utility.Vector3dVector(data_pd)
t3 = timer()
pc_points2 = o3d.utility.Vector3dVector(data_np)
t4 = timer()
On my machine
# Before
Vector3dVector data_pd: 35.64ms
Vector3dVector data_np: 0.20ms
# After quick fix
Vector3dVector data_pd: 0.47ms # time for both ascontiguousarray and Vector3dVector
Vector3dVector data_np: 0.15ms
We'll look into Vector3dVector and see whether the C-contiguous conversion can be handled internally in a more efficient way.
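Until then, the quick fix can be wrapped in a small helper on the caller's side. The sketch below is an assumption-laden example rather than anything Open3D ships: as_vector3d_input is a hypothetical name, and the float64 conversion is included because Vector3dVector is documented as taking a float64 array of shape (n, 3).

```python
import numpy as np


def as_vector3d_input(points):
    """Return an (N, 3) float64, C-contiguous array (hypothetical helper).

    np.ascontiguousarray only copies when the layout or dtype has to change.
    """
    arr = np.ascontiguousarray(points, dtype=np.float64)
    if arr.ndim != 2 or arr.shape[1] != 3:
        raise ValueError(f"expected shape (N, 3), got {arr.shape}")
    return arr


# An F-contiguous input is copied into row-major order.
fortran_points = np.asfortranarray(np.random.rand(1000, 3))
fixed = as_vector3d_input(fortran_points)

# An input that is already float64 and C-contiguous is passed through.
already_ok = np.random.rand(1000, 3)
untouched = as_vector3d_input(already_ok)
```

The result would then be handed over as o3d.utility.Vector3dVector(as_vector3d_input(data_pd)).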
Thank you @yxlao!
Maybe interesting in this context: I also kept using pandas when I tested Open3D Tensors because it was way faster to read my PCD file.
open3d.t.io.read_point_cloud: 250ms
pd.read_csv: 70ms
I use PCD files with 131072 points, which have nine attributes with different data types.
@marcelbrucker is it possible to share the PCD file (or a truncated version of it) and the code you use? We can do some debugging to improve the t.io.read_point_cloud speed.
import open3d as o3d
import numpy as np
import pandas as pd
from timeit import default_timer as timer
device = o3d.core.Device("CPU:0")
input_file = "Example_PCD.pcd"
# Read PCD file with various attributes
t1 = timer()
pcd = o3d.t.io.read_point_cloud(input_file)
xyz = pcd.point["positions"].numpy()
intensity = pcd.point["intensity"].flatten().numpy()
ring = pcd.point["ring"].flatten().numpy()
ambient = pcd.point["ambient"].flatten().numpy()
additional_attributes = np.hstack((intensity[:, None], ring[:, None], ambient[:, None]))
pc = o3d.geometry.PointCloud()
pc.points = o3d.utility.Vector3dVector(np.ascontiguousarray(xyz))
pc.normals = o3d.utility.Vector3dVector(np.ascontiguousarray(additional_attributes))
t2 = timer()
# Read PCD file with pandas
# pcd_pd = pd.read_csv(input_file, sep=" ", header=0, names=["x", "y", "z", "intensity", "t", "reflectivity", "ring", "ambient", "range"], skiprows=10, dtype={"x": np.float32, "y": np.float32, "z": np.float32, "intensity": np.float32, "t": np.uint32, "reflectivity": np.uint16, "ring": np.uint8, "ambient": np.uint16, "range": np.uint32})
pcd_pd = pd.read_csv(input_file, sep=" ", header=0, names=["x", "y", "z", "intensity", "t", "reflectivity", "ring", "ambient", "range"], skiprows=10, dtype=float)
xyz_pd = pcd_pd.loc[:, ["x", "y", "z"]].to_numpy()
additional_attributes_pd = pcd_pd.loc[:, ["intensity", "ring", "ambient"]].to_numpy()
pc_pd = o3d.geometry.PointCloud()
pc_pd.points = o3d.utility.Vector3dVector(np.ascontiguousarray(xyz_pd))
pc_pd.normals = o3d.utility.Vector3dVector(np.ascontiguousarray(additional_attributes_pd))
t3 = timer()
print(f"PCD with open3d.t.io: \t\t {((t2 - t1) * 1e3):.2f}ms")
print(f"PCD with pandas: \t\t {((t3 - t2) * 1e3):.2f}ms")
PCD with open3d.t.io: 465.82ms
PCD with pandas: 81.87ms
Passing the per-column dtypes (the commented-out read_csv call above) triples the time pandas needs, but it is still faster than open3d.t.io.
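The read_csv calls above hard-code the column names, the dtypes, and the number of header lines. For ASCII PCD files these can also be taken from the header itself. The following is a minimal sketch under stated assumptions: read_pcd_header is a hypothetical helper, every field is assumed to have COUNT 1, and the sample header is invented for illustration, not taken from the file discussed here.

```python
import io

# PCD TYPE letter -> dtype name prefix; combined with SIZE (bytes) this gives
# names such as "float32" or "uint8".
_PREFIX = {"F": "float", "U": "uint", "I": "int"}


def read_pcd_header(stream):
    """Parse an ASCII PCD header from a text stream (hypothetical helper).

    Returns (names, dtypes, n_header_lines). Assumes COUNT is 1 for every field.
    """
    meta = {}
    n_lines = 0
    for line in stream:
        n_lines += 1
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and comments
        meta[tokens[0]] = tokens[1:]
        if tokens[0] == "DATA":
            break  # the DATA line ends the header
    else:
        raise ValueError("no DATA line found")
    if meta["DATA"] != ["ascii"]:
        raise ValueError("only ASCII PCD files are handled in this sketch")
    names = meta["FIELDS"]
    dtypes = {
        name: f"{_PREFIX[kind]}{8 * int(size)}"
        for name, size, kind in zip(names, meta["SIZE"], meta["TYPE"])
    }
    return names, dtypes, n_lines


# Invented sample header with two data rows.
sample = """# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z intensity ring
SIZE 4 4 4 4 1
TYPE F F F F U
COUNT 1 1 1 1 1
WIDTH 2
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 2
DATA ascii
1.0 2.0 3.0 0.5 7
4.0 5.0 6.0 0.25 9
"""
names, dtypes, n_header = read_pcd_header(io.StringIO(sample))
print(names, dtypes, n_header)
```

The returned values could then be passed on as pd.read_csv(input_file, sep=" ", names=names, skiprows=n_header, header=None, dtype=dtypes).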
@marcelbrucker thank you, that is helpful. We'll look into that.
I have similar problems, but I'm not sure if the reason is the same. The code below takes about 1 second. The function 'remove_zeros' deletes the [0,0,0] entries from the points; without it, the conversion is even slower.
The pcdmap_im numpy array comes from my pybind11 function, where I convert the cv::Mat to py::array_t.
open3d 0.15.2
points = remove_zeros(pcdmap_im)
points = np.ascontiguousarray(points)
start = time.time()
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
print("convert pcd to o3d takes: ", time.time()-start)
cv::Mat image = self.GetPcdMap(); // returns a cv::Mat with 3 channels
py::array_t<float> result = py::array_t<float>(image.rows * image.cols * image.channels());
auto buf = result.request();
float *ptr = (float *) buf.ptr;
for (int i = 0; i < image.rows; i++) {
    for (int j = 0; j < image.cols; j++) {
        for (int k = 0; k < image.channels(); k++) {
            *ptr++ = image.at<cv::Vec3f>(i, j)[k] * 0.001;
        }
    }
}
return result;
0 frames were cleared from the cyclic buffer
Frame was triggered, Frame Id: 270
convert pcd to o3d takes: 0.8412952423095703
Frame was triggered, Frame Id: 271
convert pcd to o3d takes: 0.9536771774291992
Frame was triggered, Frame Id: 272
convert pcd to o3d takes: 0.8492419719696045
Frame was triggered, Frame Id: 273
convert pcd to o3d takes: 0.9604189395904541
Frame was triggered, Frame Id: 274
convert pcd to o3d takes: 0.8200759887695312
disconnect camera
Could you please tell me why the conversion from numpy to o3d points is so slow? Thanks a lot!
Describe the issue
I have a strange performance issue with open3d.utility.Vector3dVector(). I use PCD files to visualize 3D point cloud data with open3d, and I read these files with pandas because I need additional attributes like intensity from the PCD file. open3d.utility.Vector3dVector() is much slower when I use the numpy array coming from a file than when I use some random numpy array.
Steps to reproduce the bug
Error message
No response
Expected behavior
I expect comparable speed.
Open3D, Python and System information
Additional information
No response