tmontaigu / laszip-python

Python bindings for LASzip
MIT License

Strange issue with a Bus Error #3

Closed jpswensen closed 2 years ago

jpswensen commented 2 years ago

I am going to preface this by saying that I followed similar procedures on a different device and everything works fine there. I had been using LASzip + laszip-python + pylas on a different Raspberry Pi machine with no issues.

Now we are setting up a deployment image (and switching to laspy), so I was rebuilding LASzip + laszip-python + laspy for this "pristine" setup. The problem is that the same code I was using before started giving a Bus Error. My initial debugging steps were to make sure the data looked good, to try writing a LAS instead of a LAZ, and then to inject a bunch of debug statements to figure out where this was happening (a Bus Error doesn't give a nice stack trace like most errors do). I also made a test program as simple as

import laspy

las = laspy.read('test.las')
las.write('test.laz')

My first check was to make sure that it wasn't some issue with the data, so I took the same test.las file over to the other machine, where it ran perfectly. I therefore suspect some quirk in how I built the packages, or an interaction with library versions on the new system that weren't on the old machine, but I don't quite know how to hunt this down.
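Because a bus error kills the interpreter before Python can print a traceback, one option is the standard-library `faulthandler` module, which dumps the Python stack when a fatal signal such as SIGBUS arrives. The following is a minimal sketch, assuming a POSIX system: the child script and its `write_laz` function are hypothetical stand-ins that send SIGBUS to themselves, so the example runs without LASzip installed.

```python
import subprocess
import sys

# Hypothetical stand-in script: instead of calling laspy, it sends SIGBUS to
# itself so the example runs without LASzip installed (POSIX only).
CHILD_SCRIPT = """
import faulthandler
import os
import signal

faulthandler.enable()  # dump the Python stack on SIGSEGV, SIGBUS, SIGFPE, ...

def write_laz():
    # Placeholder for the native call that crashes, e.g. las.write('test.laz')
    os.kill(os.getpid(), signal.SIGBUS)

write_laz()
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_SCRIPT],
    capture_output=True,
    text=True,
)

# The child still dies from the signal, but stderr now names the Python frame.
print(result.returncode)
print(result.stderr)
```

In a real script, calling `faulthandler.enable()` at the top, or running it with `python -X faulthandler`, should give the same kind of output. It only shows Python frames, so locating the failing line inside a native library still needs a debugger.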

I was able to at least identify where the problem is occurring. In laszipper.cpp, in the method LasZipper::compress, the Bus Error happens on the second iteration of the for loop over the points, at the last step of the loop body:

if (laszip_write_point(m_writer))
{
    throw laszip_error::last_error(m_writer);
}

I went ahead and tried to print out some of the elements of the m_point that was being written, but couldn't see anything wrong with the data.

I guess I am looking for some guidance on what to try next. I haven't dug into LASzip itself to see where inside that library the problem is occurring. The weird thing is that I usually see Bus Errors when bad memory accesses are happening (like trying to access something outside an array's bounds). This particular test file only has 750k points, so it is not a huge point cloud.

jpswensen commented 2 years ago

I can also upload the test.las file I am using if that would help. Unfortunately, I feel like I am reporting something that most others won't be able to replicate, because it is probably unique to some weird dependency issue with the libraries that are installed on the Raspberry Pi.

jpswensen commented 2 years ago

Sorry to keep commenting (maybe I should just edit my earlier comments), but I found out where the problem seems to be in the guts of LASzip. I rebuilt LASzip with debug symbols and the crash is happening at

LASwriteItemCompressed_POINT14_v3::write (this=0x602800, item=0x5fff20 "\300D", context=@0xbeffe618: 0)
    at /home/pi/src/rpi-realsense-bootstrap/LASzip/src/laswriteitemcompressed_v3.cpp:505
505   BOOL gps_time_change = (((LASpoint14*)item)->gps_time != ((LASpoint14*)last_item)->gps_time);

I am not actually setting the gps time in this particular file. Maybe I will go back and explicitly set that and try again.

One more update, though maybe this should be posted on the laspy issue tracker instead. I had been creating the file with code that looked like

    import laspy
    import numpy as np

    # colorizedPCL: N x 7 array with columns x, y, z, red, green, blue, intensity
    las = laspy.create(point_format=8, file_version='1.4')

    las.x = colorizedPCL[:,0]
    las.y = colorizedPCL[:,1]
    las.z = colorizedPCL[:,2]

    las.red = np.uint16(colorizedPCL[:,3])
    las.green = np.uint16(colorizedPCL[:,4])
    las.blue = np.uint16(colorizedPCL[:,5])

    las.intensity = colorizedPCL[:,6]

If I create the LasData object with a simple las = laspy.create(), then it writes out to file perfectly fine. I did another test with point_format=2, and that also works.

I'm not sure why I chose point format 8 when I first started using LAS files, but I think point format 2 is sufficient for what I am storing. So I have a workaround, but have zero clue why there appears to be a really bad memory corruption issue when I try point format 8.

tmontaigu commented 2 years ago

Your bus error, together with the fact that LASzip crashes at

LASwriteItemCompressed_POINT14_v3::write (this=0x602800, item=0x5fff20 "\300D", context=@0xbeffe618: 0)
    at /home/pi/src/rpi-realsense-bootstrap/LASzip/src/laswriteitemcompressed_v3.cpp:505
505   BOOL gps_time_change = (((LASpoint14*)item)->gps_time != ((LASpoint14*)last_item)->gps_time);

makes me think that the way LASzip is written, it performs a memory access at an address that is not properly aligned. I think that does not cause problems on x86 (apart from being slower), but on other architectures it may be completely invalid.

What architecture are you running it on?
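To make "not properly aligned" concrete: in the LAS 1.4 point record shared by point formats 6 to 10 (which includes the point format 8 used in this thread), the 8-byte GPS time field sits at byte offset 22 of the packed record, which is not a multiple of 8. The NumPy sketch below compares that packed layout with the one a C compiler would produce using natural alignment. The field names are illustrative, and this is the on-disk layout from the LAS specification, not LASzip's internal LASpoint14 struct, so it illustrates the general hazard rather than diagnosing this particular crash.

```python
import numpy as np

# Core fields of LAS 1.4 point formats 6-10, in record order.
# Field names are illustrative; sizes follow the LAS 1.4 specification.
FIELDS = [
    ("X", "<i4"),
    ("Y", "<i4"),
    ("Z", "<i4"),
    ("intensity", "<u2"),
    ("return_bits", "u1"),      # return number / number of returns
    ("flag_bits", "u1"),        # classification flags, channel, scan flags
    ("classification", "u1"),
    ("user_data", "u1"),
    ("scan_angle", "<i2"),
    ("point_source_id", "<u2"),
    ("gps_time", "<f8"),
]

packed = np.dtype(FIELDS)               # no padding, as stored in the file
aligned = np.dtype(FIELDS, align=True)  # padded the way a C struct would be

packed_offset = packed.fields["gps_time"][1]
aligned_offset = aligned.fields["gps_time"][1]

print("packed :", packed_offset, "of", packed.itemsize, "bytes")
print("aligned:", aligned_offset, "of", aligned.itemsize, "bytes")
```

If native code casts a pointer into the packed record to a struct pointer and reads the double directly, the load address is only 2-byte aligned. x86 tolerates that, while some ARM load instructions fault on it, and the kernel can report the fault as a bus error.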

jpswensen commented 2 years ago

The hardware is an arm64 Raspberry Pi 4, though I am running a 32-bit ARM (arm32) Linux distribution right now. I will try to work up an example that takes laszip-python and laspy completely out of the picture and submit it to the LASzip GitHub issues.

Feel free to close this since, after all my digging, it looks like a LASzip issue rather than something further up the food chain.