CloudCompare / CloudComPy

Python wrapper for CloudCompare
Other
283 stars 40 forks source link

Use cloudComPy.ccPointCloud with a numpy array (bug in cloud.coordsFromNPArray_copy) #29

Open fredpeertje opened 2 years ago

fredpeertje commented 2 years ago

Is it possible to use cloudComPy.ccPointCloud by providing a numpy array to it instead of the filename of a point cloud?

In my code, I do some preprocessing on the point clouds before performing (and I don't want to save and load it again...):

cc.M3C2.computeM3C2([cloud,cloud1], paramFilename)
fredpeertje commented 2 years ago

Found it!

import laspy
pointcloud = laspy.read('test.laz')
points = np.vstack((pointcloud.x, pointcloud.y, pointcloud.z)).T.astype(np.float32) 
cloud = cc.ccPointCloud('cloud')
cloud.coordsFromNPArray_copy(points)
fredpeertje commented 2 years ago
Schermafbeelding 2022-03-21 om 11 28 25

Hmm, looks like there is a bug when using cloud.coordsFromNPArray_copy. The length of the numpy array and the ccPointCloudPy differ (10 million more points).

prascle commented 2 years ago

Hello, Concerning your last comment, the log give the number of bytes, not the number of points. Each coordinate is a float (4 bytes), 3 coordinates per point so 908705 3 4 = 10904460 seems OK ! Concerning the first comment, I am not sure to understand. in cc.M3C2.computeM3C2([cloud,cloud1], paramFilename) cloud and cloud1 are ccPointCloud objects and can be obtained by a lot of ways, either by loading a file cloud = cc.loadPointCloud('test.laz') and/or by creating/manipulating the coordinates and fields with numpy. See for instance test017py to create a cloud from scratch. I plan to make a kind of tutorial to introduce properly the functions of CloudComPy but it is not obvious to define the right structure... in the meantime, you'll have to search the Python test scripts, and the [reference documentation]. (https://www.simulation.openfields.fr/documentation/CloudComPy/html)

Regards, Paul

fredpeertje commented 2 years ago

Hello Paul, Oke, thank you for your answer!

I tried to rewrite the test030py example script of the M3C2 implementation. So, instead of loading a point cloud with cc.loadPointCloud('test.laz'), it converts a numpy array (from an already loaded point cloud) to a ccPointCloud object. I thought I could do it using the method explained in test017py, but the M3C2 is extremely slow after...

It takes ~300 seconds now, compared to ~20 seconds when loading using cc.loadPointCloud('test.laz'). I run the code on a linux machine.

prascle commented 2 years ago

OK, I understand better: your point cloud is not originally loaded with CloudComPy but with another tool and you need to convert it to the right object type (I should have guessed). Your solution is correct. If the point cloud obtained by numpy or by loading the file is the same, I don't see the reason for the performance problem, unless you saturated the memory in the slower case. Let me know if this is not the case, I will check.

fredpeertje commented 2 years ago

Good thing to check the memory indeed. But, a lot free memory left.

I observed this float difference when comparing laspy read with cc.loadPointCloud read:

Schermafbeelding 2022-03-21 om 18 03 46

An example point in the point cloud has X value 119331.20 and Y value 485139.50.

prascle commented 2 years ago

Well, I'm lost. Do you have any small sample files for me to look at? I can't reproduce the problem. I installed laspy quickly, but it doesn't read the compressed (.laz) files I have. I suppose I will be able to fix that.

fredpeertje commented 2 years ago

Sample files can be found here. The point clouds ending with _xyz.

Code 1:

import laspy # pip install laspy[lazrs]
pointcloud = laspy.read('sidewalk_filter_xyz.laz')
points = np.vstack((pointcloud.x, pointcloud.y, pointcloud.z)).T.astype(np.float32)

pointcloud1 = laspy.read('sidewalk_filter2_xyz.laz')
points1 = np.vstack((pointcloud1.x, pointcloud1.y, pointcloud1.z)).T.astype(np.float32)

cloud = cc.ccPointCloud('sidewalk_filter_xyz')
cloud.coordsFromNPArray_copy(points)

cloud1 = cc.ccPointCloud('sidewalk_filter2_xyz')
cloud1.coordsFromNPArray_copy(points1)

Code 2:

# Load point clouds using filename
cloud = cc.loadPointCloud('sidewalk_filter_xyz.laz')
cloud1 = cc.loadPointCloud('sidewalk_filter2_xyz.laz')
fredpeertje commented 2 years ago

@prascle updated the files in the GitHub branch shared in the post above

prascle commented 2 years ago

I get it! I'm sorry, I forgot that, by default, loadPointCloud does a coordinate change to avoid handling large numbers. You can test with the GUI to see how CloudCompare handles this. When you only work with CloudCompare, it's completely transparent and you forget about it. To have the same values, you should use: cloud = cc.loadPointCloud("sidewalk_filter_xyz.laz", cc.CC_SHIFT_MODE.NO_GLOBAL_SHIFT)

But I think it's still a good idea to apply a shift of coordinates, it is far more precise this way. Oh, I just see your last post, I suppose my comment is still valid.

fredpeertje commented 2 years ago

Thank you, this solves one issue that I saw this morning: Schermafbeelding 2022-03-22 om 11 20 24

Schermafbeelding 2022-03-22 om 11 20 55

Do you also have the long computation time of M3C2 when loading point clouds in CloudComPy using coordsFromNPArray_copy? Also the result of the point cloud is empty.

fredpeertje commented 2 years ago

Ah, I understand your last comment now better. I see in the GUI that there is indeed also a global shift applied.

Still, there are two issues:

prascle commented 2 years ago

I learned something about numpy: when you transpose a ndarray, it is not reordered in memory, you have to ask explicitly. There is a problem in CloudComPy regarding the storage of the coordinate shift. Here is a working script with a shift of coordinates manually set.

import os
import sys
import math
os.environ["_CCTRACE_"]="ON"

import cloudComPy as cc
import numpy as np
import time
import laspy # pip install laspy[lazrs]

paramFilename = "m3c2_params.txt"

start = time.time()

isLaspy=False

import cloudComPy.M3C2

step1 = time.time()
if isLaspy:
    pointcloud = laspy.read('sidewalk_filter_xyz.laz')
    points = np.vstack((pointcloud.x, pointcloud.y, pointcloud.z)).transpose().astype(np.float32).copy(order='C')
    print("ndarray memory C_CONTINUOUS", points.flags['C_CONTIGUOUS'])
    points[:,0] -= 119300
    points[:,1] -= 485100
    cloud = cc.ccPointCloud('sidewalk_filter_xyz')
    cloud.coordsFromNPArray_copy(points)

    pointcloud1 = laspy.read('sidewalk_filter2_xyz.laz')
    points1 = np.vstack((pointcloud1.x, pointcloud1.y, pointcloud1.z)).transpose().astype(np.float32).copy(order='C')
    points1[:,0] -= 119300
    points1[:,1] -= 485100
    cloud1 = cc.ccPointCloud('sidewalk_filter2_xyz')
    cloud1.coordsFromNPArray_copy(points1)

else:
    cloud  = cc.loadPointCloud('sidewalk_filter_xyz.laz',  cc.CC_SHIFT_MODE.XYZ, 0, -119300.0, -485100.0, 0.)
    cloud1 = cc.loadPointCloud('sidewalk_filter2_xyz.laz', cc.CC_SHIFT_MODE.XYZ, 0, -119300.0, -485100.0, 0.)
step2 = time.time()
print(" -- load ", step2 - step1)

cloud2 = cc.M3C2.computeM3C2([cloud,cloud1], paramFilename)

step3 = time.time()
print(" -- M3C2 ", step3 - step2)

if isLaspy:
    cc.SaveEntities([cloud, cloud1, cloud2], "M3C2l.bin")
    cc.SavePointCloud(cloud2, "M3C2l.laz")
else:
    cc.SaveEntities([cloud, cloud1, cloud2], "M3C2.bin")
    cc.SavePointCloud(cloud2, "M3C2.laz")

end = time.time()
print(" -- save ", end - step3)
print(" -- total ", end - start)

You have to change the boolean isLaspy to test with or without laspy/numpy. I save everything to compare the data and result. I obtain the same result in both cases, with approximately the same time. For the shift of coordinates, the idea is to recenter the bounding box at the origin, to improve accuracy and speed. Paul

prascle commented 2 years ago

I have tested with the .laz files from #issue28:

import os
import sys
import math
os.environ["_CCTRACE_"]="ON"

import cloudComPy as cc
import numpy as np
import time

cc.initCC()  # to do once before using plugins or dealing with numpy

paramFilename = "m3c2_params.txt"

start = time.time()

if cc.isPluginM3C2():
    import cloudComPy.M3C2
    step1 = time.time()
    cloud = cc.loadPointCloud('sidewalk_filter.laz',  cc.CC_SHIFT_MODE.XYZ, 0, -119300.0, -485100.0, 0.)
    step2 = time.time()
    print(" -- load1 ", step2 - step1)
    cloud1 = cc.loadPointCloud('sidewalk_filter2.laz',  cc.CC_SHIFT_MODE.XYZ, 0, -119300.0, -485100.0, 0.)
    step3 = time.time()
    print(" -- load2 ", step3 - step2)

    cloud2 = cc.M3C2.computeM3C2([cloud,cloud1], paramFilename)

    step4 = time.time()
    print(" -- M3C2 ", step4 - step3)

    cc.SaveEntities([cloud, cloud1, cloud2], "M3C2.bin")
    cc.SavePointCloud(cloud2, "M3C2.laz")

    end = time.time()
    print(" -- save ", end - step4)
    print(" -- total ", end - start)

I do not see any problem here, and it looks fine for me. capture capture2

fredpeertje commented 2 years ago

Wow, .copy(order='C') does the trick. I now also obtain the same result in both cases!

Thank you for the explanation, this was also new for me: when you transpose a ndarray, it is not reordered in memory, you have to ask explicitly.