ryan-brazeal-ufl / OpenPyLivox

Python3 driver for Livox lidar sensors
GNU General Public License v3.0
85 stars · 44 forks

Poor performance #3

Closed · BingEdison closed this issue 3 years ago

BingEdison commented 4 years ago

Howdy! First, I wanted to thank you for making Livox sensors accessible via Python. The issue I'm having is how slow write speeds are on a Raspberry Pi with OpenPyLivox. When working with the C++ Livox SDK this is not an issue, so I'm assuming it is due to writing as ASCII. Could you provide any insight into how to improve write-speed performance? Write times are nearly 10 seconds for a 4-second capture. Switching to binary has yielded 2-second write times, but considering the point count (~300k) this still does not seem right. If anyone has any insight, it would be much appreciated.

Cheers, Eddy

ryan-brazeal-ufl commented 4 years ago

Hi Eddy,

Always glad to hear that the package is being used. As you mentioned, the write speeds are certainly not optimized for performance in the current version. Right now the package implements (nearly) all SDK functions as of firmware 03.06. Here comes the age-old excuse... but I'm swamped right now with my academic research. However, way down on my to-do list I have written "OpenPyLivox update", simply meaning I haven't forgotten about this. Hopefully within the next few weeks I'll be able to get back to the development.

Curious, are you hoping to use the collected observations in a near real-time scenario (e.g., a point cloud viewer or the like), or are you just wanting the observations written to file when data collection stops?

-RB

daniellukeahead commented 4 years ago

Hi Eddy and Ryan, Thank you for raising this problem. I have been experiencing the same issue: 3 minutes of lidar data takes more than 5 minutes to write to a CSV file.
This also reminds me of another issue I experienced: the script will break from occupying too much memory if you set the duration a little longer. In my case, with a Pi 4B (4 GB memory), I can only capture less than 4 minutes of data with the Mid-40. I look forward to your update on this and, once again, thanks for these lovely scripts.

ryan-brazeal-ufl commented 4 years ago

Hi Daniel(?) (and Eddy), I appreciate the constructive feedback! Certainly an oversight on my part that the current library doesn't support real-time / long-duration data logging. Naively, my requirements at the time of development were only for ~10 second datasets. However, a current research thread of mine also requires larger (i.e., 20+ minute) datasets from the Mid-40 and Horizon sensors to be collected using a Raspberry Pi. So, VERY SOON, I will be furthering the development of OpenPyLivox to include: efficient large-dataset collection, latest firmware support, Horizon sensor support, LAS (and possibly LVX) file support, and Livox Hub support (Hub support is the lowest priority right now). Unfortunately I do not have a firm ETA on this next round of development, so please don't hold your breath. But it is coming!! All the best, stay safe!!! Ryan

daniellukeahead commented 4 years ago

Hi Ryan, Glad you will be working on further development soon. The work you're going to do is definitely useful and will bring its application to the next level. I will be following it. Thanks. Stay safe! Daniel

ryan-brazeal-ufl commented 4 years ago

Hi Eddy and Daniel, I had a little bit of time to look into the library's current poor performance. As a quick solution, and hopefully it proves useful, I have updated the library to v1.0.1 with a new data control method, .dataStart_RT(); see section 4.6 of the Wiki. Simply replace the .dataStart() method in your code with the new .dataStart_RT() method and things should be improved. I tested a 300-second (5 min) data collection on my MacBook using the old .dataStart() method and the new .dataStart_RT() method, and the memory usage went from ~5 GB down to <400 MB. Hopefully a Raspberry Pi (or other small computer) sees similar memory improvements as well. This isn't my final update for improving the performance of the library, but hopefully it helps during the interim period until I can get v1.1.0 released. Please let me know your experiences (good and ESPECIALLY bad) with this update and the new .dataStart_RT() method. Thanks, Ryan
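For readers following along, the core of the change can be sketched as streaming each packet's points straight to the output file instead of accumulating them all in memory. This is an illustrative sketch with a simulated packet source, not OPL's actual implementation; all names and the packet layout here are hypothetical:

```python
# Sketch only: stream points to the CSV file as packets arrive, so memory
# usage stays flat regardless of capture duration. The packet source is
# simulated; a real capture thread would read UDP payloads from the sensor.
import csv
import io
import struct

def simulated_packets(n_packets=5, points_per_packet=3):
    """Yield fake binary payloads of little-endian (x, y, z) int32 triples in mm."""
    for p in range(n_packets):
        payload = b""
        for i in range(points_per_packet):
            payload += struct.pack("<iii", p * 1000 + i, i, -i)
        yield payload

def capture_streaming(packets, out_file):
    """Write each point to the open CSV file as soon as its packet arrives."""
    writer = csv.writer(out_file)
    writer.writerow(["x_m", "y_m", "z_m"])
    count = 0
    for payload in packets:
        for x, y, z in struct.iter_unpack("<iii", payload):
            writer.writerow([x / 1000.0, y / 1000.0, z / 1000.0])
            count += 1
    return count

buf = io.StringIO()
n = capture_streaming(simulated_packets(), buf)
print(n)  # 15
```

Since nothing is buffered beyond the current packet, memory use stays roughly constant no matter how long the capture runs.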

daniellukeahead commented 4 years ago

Hi Ryan, It is nice to see your update. I had a quick test of the .dataStart_RT() method on a Mid-100 with an RPi 4 as the host. I noticed you write to CSV inside the capture loop to get real-time data. However, it seems to me this significantly reduces the loop's iteration speed. For example, it was expected to capture 30,000 points per second for the Mid-100. In my test, I ran for 200 seconds and only got around 400,000 points in total (figure below); ~6,000,000 were expected.
Also, I found that code like "{0:.3f}".format(float(struct.unpack('<i',data_pc[bytePos:bytePos+4])[0])/1000.0) is time-consuming and slows down the loop on the RPi 4. (screenshot: Image_20200528235445)
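To illustrate Daniel's point, the snippet below compares the quoted per-value pattern with a single bulk unpack of the same payload. The packet layout (consecutive little-endian int32 values in mm) is a simplification for the example, not the exact Livox format:

```python
# Sketch: the per-value pattern calls struct.unpack once per 4-byte field,
# while a single bulk unpack does the same work with far fewer Python-level
# function calls. Both produce identical output strings.
import struct

# Six fake int32 values in millimetres
data_pc = struct.pack("<6i", 1500, -2250, 3000, 4001, 4002, 4003)

# Per-value pattern (slow on a Pi: one unpack + one format call per value)
slow = []
for bytePos in range(0, len(data_pc), 4):
    slow.append("{0:.3f}".format(
        float(struct.unpack('<i', data_pc[bytePos:bytePos + 4])[0]) / 1000.0))

# Bulk pattern: one unpack call for the whole payload
values = struct.unpack("<{}i".format(len(data_pc) // 4), data_pc)
fast = ["{0:.3f}".format(v / 1000.0) for v in values]

assert slow == fast
print(fast[0])  # 1.500
```

Deferring the string formatting entirely (e.g., writing raw binary and formatting offline) removes even more per-point overhead from the capture loop.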

This new method did improve memory performance! The memory usage stayed at around 500 MB and did not keep increasing as before.

So my quick response is that both writing the CSV file and doing struct.unpack() inside the loop have impacted the continuity of capturing points. The main loop seems to be vulnerable to "heavy" code. Could you please advise whether all of the points were written to the CSV file when you ran it on your MacBook? That may tell us how much difference the hardware configuration makes to the performance.

Thanks, Daniel

BingEdison commented 4 years ago

Try saving each packet in its entirety as binary data and parsing it later. After making that change, capture performance has been blazingly fast.
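Eddy's approach can be sketched as a length-prefixed packet dump: the capture loop appends raw payloads unparsed, and a separate pass decodes them after collection stops. The file layout and point format below are hypothetical, not OPL's actual binary format:

```python
# Sketch of capture-then-parse: write raw payloads during capture (cheap),
# decode them offline afterwards (can be as slow as needed).
import os
import struct
import tempfile

def dump_packets(packets, path):
    """Capture loop: length-prefix each raw payload and append it unparsed."""
    with open(path, "ab") as f:
        for payload in packets:
            f.write(struct.pack("<I", len(payload)))
            f.write(payload)

def parse_dump(path):
    """Offline pass: read payloads back and unpack (x, y, z) int32 triples."""
    points = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            (length,) = struct.unpack("<I", header)
            points.extend(struct.iter_unpack("<iii", f.read(length)))
    return points

pkt = struct.pack("<3i", 100, 200, 300) + struct.pack("<3i", 400, 500, 600)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
dump_packets([pkt], path)
pts = parse_dump(path)
print(pts)  # [(100, 200, 300), (400, 500, 600)]
os.remove(path)
```

The capture loop then does only one or two file writes per packet, with no per-point Python work at all.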


daniellukeahead commented 4 years ago

Hi Eddy, Did you mean writing to the binary file packet-by-packet instead of point-by-point (i.e., writing to the binary file every ~100 points)? Have you tested it with a long acquisition?

V1.0.0 has very smooth capture performance. The issue is that it keeps storing captured data in memory, which is only released once the data are written to file, so the memory gets eaten up quickly. It is not suitable for long-duration acquisition.

ryan-brazeal-ufl commented 4 years ago

Hi Eddy and Daniel, Many continued thanks for your interest in OPL and for providing excellent feedback, ideas, criticisms, etc. It is a great motivator for me to find the time necessary to make code improvements, knowing others are using OPL (and possibly waiting for updates). I have released another patch update (v1.0.2), which technically speaking may be "new" enough to qualify as a minor release, BUT ANYWAYS... The new functionality is based around collecting the point cloud data and writing it to a file using a simple binary format in real time (i.e., not storing the observations in memory and then dumping them to a file after data collection has ceased). Unfortunately I am still NOT yet testing OPL on smaller computers (RPi, Odroid, etc.), so hopefully these improvements can be experienced on your machines. There is one new object method you will want to explore, .dataStart_RT_B(), and 2 replacement methods for .saveDataToCSV(...) and .closeCSV(), which have now been deprecated. The replacements are .saveDataToFile(...) and .closeFile(). See section 4.6 in the Wiki for complete details; the demo project has also been updated. Lastly, a new library function, .convertBin2CSV(...), has been added which, as the name suggests, converts the point cloud data stored in the simple binary file to the same CSV file format already used within OPL. See section 5 in the Wiki for details. Hopefully it's not too buggy, as I raced a bit trying to get this released for you guys to use. Cheers, Ryan
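The conversion step can be sketched as reading fixed-size binary records and emitting CSV rows. The 13-byte record layout assumed below (x, y, z as int32 millimetres plus a reflectivity byte) is illustrative only, not OPL's documented format:

```python
# Sketch of the binary-to-CSV conversion idea behind .convertBin2CSV(...).
# The record layout here is an assumption for illustration.
import csv
import io
import struct

RECORD = struct.Struct("<iiiB")  # x, y, z (mm), reflectivity -> 13 bytes

def convert_bin_to_csv(bin_bytes, csv_file):
    """Decode fixed-size records and write one CSV row per point (metres)."""
    writer = csv.writer(csv_file)
    writer.writerow(["x_m", "y_m", "z_m", "reflectivity"])
    for offset in range(0, len(bin_bytes), RECORD.size):
        x, y, z, refl = RECORD.unpack_from(bin_bytes, offset)
        writer.writerow([x / 1000.0, y / 1000.0, z / 1000.0, refl])

raw = RECORD.pack(1000, 2000, 3000, 42) + RECORD.pack(-500, 0, 250, 7)
out = io.StringIO()
convert_bin_to_csv(raw, out)
print(out.getvalue().splitlines()[1])  # 1.0,2.0,3.0,42
```

All of the expensive unpacking and string formatting happens here, after the capture has finished, rather than inside the real-time loop.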

BingEdison commented 4 years ago

I can confirm that writing packets to binary works great on the Pi 3 & 4: hardly any wait time, even at larger write intervals of 10 seconds.


daniellukeahead commented 4 years ago

Perfect. I will have a try.

daniellukeahead commented 4 years ago

Hi Ryan, I tested it on a Windows PC with the Mid-100. It works great! 100% good points captured.

I also did a test on an RPi 4B with both the Mid-40 and the Mid-100. For the Mid-40, as Eddy mentioned, it works great on the RPi 4 for long captures: 100% good points. However, for the Mid-100 (basically 3 Mid-40s), I got only around 52% good points for each of its units.

I am not sure, but maybe the Mid-100 demands more write speed? If so, any ideas for giving it a further boost?

ryan-brazeal-ufl commented 4 years ago

Hello,

Thanks for the feedback. I appreciate your testing on the RPis. My current research is slowly but surely progressing to the point where I too will be optimizing OPL for RPi4 applications using the MID-40, MID-100, and Horizon sensors and the Livox Hub.

The current code is still doing validity testing on each observation BEFORE writing the observations to file (i.e., only good points are written to file). I will add the ability to bypass this testing which should (in theory) improve the real-time 'writing' performance. Then, I will also add the ability to perform the validity testing as part of the file conversion code (i.e., .convertBin2CSV(...)) which may increase the time required to convert the binary data (necessary evil), but the CSV file will still only include good points.
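The idea of deferring validity testing can be sketched as a post-capture filter. The "all-zero coordinates means no return" rule below is an assumption for illustration; the real check should follow the Livox protocol documentation:

```python
# Sketch: write every record during capture, then drop invalid returns
# during the offline conversion pass. The validity rule here (a point with
# all-zero coordinates is a non-return) is assumed for illustration.
def is_valid_point(x, y, z):
    """Assumed rule: a return with all-zero coordinates is invalid."""
    return not (x == 0 and y == 0 and z == 0)

def filter_points(points):
    """Offline validity pass, run once after capture has finished."""
    return [p for p in points if is_valid_point(*p)]

captured = [(1200, -340, 2210), (0, 0, 0), (15, 0, 0), (0, 0, 0)]
print(filter_points(captured))  # [(1200, -340, 2210), (15, 0, 0)]
```

The capture loop no longer pays a per-point branching cost, at the price of a slightly larger binary file and a longer conversion step.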

Daniel, could I ask you to try collecting long periods of data with your Mid-100 again, but this time calling the .setSphericalCS() method before starting data collection? Spherical point data (i.e., distance, zenith, azimuth) requires only 9 bytes per point observation, while Cartesian data (i.e., X, Y, Z) requires 13 bytes. I'm thinking this should also improve "writing" performance. I will need to add the ability to .convertBin2CSV(...) to convert the spherical point data to Cartesian, but that's easy enough.
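The spherical-to-Cartesian conversion mentioned here can be sketched as follows. The units (mm, angles in 0.01 degrees) and axis conventions (zenith measured from the Z axis) are assumptions for illustration rather than a verified match to the Livox protocol:

```python
# Sketch of converting a spherical observation (distance, zenith, azimuth)
# to Cartesian X, Y, Z during the offline conversion step. Units and axis
# conventions are assumed, not taken from the Livox protocol spec.
import math

def spherical_to_cartesian(dist_mm, zenith_cdeg, azimuth_cdeg):
    """Return (x, y, z) in metres from a spherical record."""
    r = dist_mm / 1000.0
    theta = math.radians(zenith_cdeg / 100.0)   # zenith from +Z
    phi = math.radians(azimuth_cdeg / 100.0)    # azimuth in the X-Y plane
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

x, y, z = spherical_to_cartesian(2000, 9000, 0)  # 2 m range, zenith 90 deg
print(round(x, 3), round(y, 3), round(z, 3))  # 2.0 0.0 0.0
```

Since this math runs only in the conversion pass, the smaller 9-byte spherical records reduce the real-time write load without losing any information.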

Continued thanks,

Ryan

daniellukeahead commented 4 years ago

Hi Ryan,

Thank you. I like your idea of moving the validity testing out of the loop. I just tested the .setSphericalCS() method with the Mid-100 for 100 s (I know that's not long, but since duration is no longer a concern...). There was a little bit of speed-up, though not enough. More total points (shown below), as well as more null points, were captured.

Cartesian data: (screenshot: WeChat Image_20200604104953)

Spherical data: (screenshot: WeChat Image_20200604105115)

I saved the captured data to 3 different files. They need to be combined and reordered by timestamp into a single file, so it would be more convenient if those functions could be added. Hoping to see your continued updates :-) Thanks!

ryan-brazeal-ufl commented 3 years ago

Hello, OpenPyLivox (OPL) has just been updated to v1.1.0 and now fully supports the Livox Mid-100 and Horizon sensors. It also supports LAS point cloud file creation. However, this release is only a stepping stone for my next research task of using an RPi with OPL to control and store data from a Mid-40 and also for a Horizon sensor. The throughput of point cloud data will be optimized further to ensure the RPi is able to handle the data transmitted by the sensors (especially the Horizon in dual return mode). I post further comments, updates to the project page (README.md). All the best, -RB