Open lgtm70b opened 6 years ago
Hi @ddheart, thanks for using the package and for providing a useful reprex. The short answer (to the best of my knowledge) is: not easily. There is a faster implementation of the algorithm which is O(n^3), whereas this particular implementation is O(n^4). This is why it slows down quite noticeably as the number of assignments (polygons) increases. More information about the implementations can be found here and here. If you are able to provide a faster implementation, that would be great. Nonetheless, I hope that, despite the time it takes, it is helpful for your use case.
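To illustrate the O(n^3) variant mentioned above, here is a minimal sketch of the shortest-augmenting-path form of the Hungarian algorithm in plain Python. It is not the package's C++ code and not the code from any pull request; the function name `solve_assignment` and the toy cost matrix are assumptions made for the example.

```python
def solve_assignment(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns (total_cost, assignment) where assignment[i] is the column
    chosen for row i. Runs in O(n^2 * m) time, i.e. O(n^3) when square.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (m + 1)    # column potentials (1-based)
    p = [0] * (m + 1)    # p[j] = row matched to column j, 0 = unmatched
    way = [0] * (m + 1)  # previous column on the augmenting path

    for i in range(1, n + 1):
        p[0] = i         # column 0 is a virtual start holding row i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta, j1 = INF, 0
            # Relax reduced costs from row i0 and pick the closest free column.
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so the chosen edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached an unmatched column
                break
        # Flip the matching along the augmenting path back to the start.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


# Toy example: rows could be polygons, columns grid cells (made-up costs).
toy_cost = [[4, 1, 3],
            [2, 0, 5],
            [3, 2, 2]]
total, assignment = solve_assignment(toy_cost)
print(total, assignment)  # 5 [1, 0, 2]
```

Each of the n outer iterations adds one row to the matching, and the inner search touches every column at most once per step, which is where the cubic bound comes from.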
I believe I was able to optimize the assignment algorithm and have submitted a pull request. After reading the C++ code, I realized the for loop wasn't necessary. On the 50 US states, my variant gave results in ~200 ms vs ~600 ms with the existing implementation (Intel i7-7700HQ @ 2.80 GHz). Assigning the 440 US congressional districts took 32 seconds. Assigning the 3000+ CONUS counties took ~34 hours.
Thanks for making this package! I also ran into a similar issue and found this Q&A on SO using clue::solve_LSAP, which is quite fast.
Here's a reprex using a larger shapefile with US congressional districts. Based on a single polygon taking 10 seconds to pass through hungarian_costmin(), rendering the full cartogram should take over 80 minutes. I've already reduced the shapefile size from 65 MB to 3 MB, but is there a way the Hungarian algorithm can be optimized to run faster during grid assignment?