OpenDroneMap / OpenSfM

Open source Structure from Motion pipeline
GNU Affero General Public License v3.0
19 stars 38 forks source link

added lru caching for faster undistortion #22

Closed originlake closed 1 year ago

originlake commented 1 year ago

To improve processing speed for undistortion https://github.com/mapillary/OpenSfM/pull/1007

pierotofy commented 1 year ago

Thanks @originlake , (sorry it took so long to review this), I think there's some code in this PR that is not related to the proposed LRU-caching improvement?

Could you also post information about the expected speed-up? (How much faster does undistortion happen with the proposed changes?)

originlake commented 1 year ago

When compiling opensfm c++ module in debug mode, there will be a linking error without the change in opensfm/src/robust/src/similarity_model.cc. I can remove it if not needed. But it is a problem when trying to debug the c++ modules.

Some time ago this branch introduced imageFilter argument but Opensfm's command line tool is not updated accordingly, if you run bin/opensfm undistort dataset_path in command line, it will raise an error due to this. The command line tool for undistortion is not used in ODM so it's not a problem, I fixed it because I'm using the command line tool for debugging. https://github.com/OpenDroneMap/OpenSfM/blob/589816e05d452c9a51148fb661a98802419e0776/opensfm/actions/undistort.py#L9-L17

About performance, ideally, the time cost of camera_mapping calculation is around 2 times the following remap. So in a single-threaded processing, you can expect around 60% time cost reduction. But in multithreading processing, it gets complicated. Each thread takes a lot longer compared to single-threaded processing and I can't observe consistent speed improvement there. I guess it is bottlenecked by other stuff like CPU caching, one image could easily fill up the L3 cache, and there are multiple images racing to consume the cache, the CPU is less efficient even if it has multiple cores running. I can't give a good measurement as it's more dependent on RAM speed, cache size, image size, etc. In my test environment, with 5000x4000 images and 8 cores, I can only observe a 10 to 20% time cost reduction.

pierotofy commented 1 year ago

That's all right, was just trying to understand the scope of the changes. No modifications needed. Thanks for fixing these issues as well as the LRU cache improvements.

I will test this more thoroughly in the upcoming days and merge it afterwards.

pierotofy commented 1 year ago

Looks great, thanks @originlake ! :pray: