DigitalCommons / mykomap

A web application for mapping initiatives in the Solidarity Economy

[CWM] Test large datasets using Leaflet:Renderer = Canvas Mode #257

Open lin-d-hop opened 2 weeks ago

lin-d-hop commented 2 weeks ago

Use Coop World Map project in Clockify

Description

@rogup has found a setting in Leaflet called Renderer. According to this article, it can make map navigation faster on large datasets if the Renderer is set to Canvas Mode. Let's test this.
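For reference, a minimal sketch of what enabling Leaflet's canvas renderer looks like (the 'map' element id, view coordinates and sample circle marker are placeholders, not Mykomap code):

```ts
import * as L from 'leaflet';

// Option A: ask Leaflet to prefer a Canvas renderer for vector layers map-wide.
const map = L.map('map', { preferCanvas: true }).setView([51.5, -0.1], 5);

// Option B: create a shared Canvas renderer explicitly and pass it to individual layers.
const canvasRenderer = L.canvas({ padding: 0.5 });
L.circleMarker([51.5, -0.1], { renderer: canvasRenderer, radius: 5 }).addTo(map);
```

Note that the canvas renderer only applies to vector layers such as paths and circle markers; icon-based L.Marker pins still create DOM elements, which turns out to matter in the testing below.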

Acceptance Criteria

Using the fake data maps created in #254, apply this setting.

rogup commented 2 weeks ago

To be able to test this, I had to get the map working for me on Chrome, which was hitting a RangeError: Maximum call stack size exceeded. This was due to us applying the min and max functions over a huge array, and it was trivial to reimplement in a way that uses less memory.
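For illustration, the kind of fix involved, assuming the original code spread a huge array into Math.min/Math.max (the data below is made up):

```ts
// Illustrative only: one coordinate per initiative, potentially hundreds of thousands of entries.
const lats: number[] = Array.from({ length: 500_000 }, () => Math.random() * 180 - 90);

// Spreading a huge array passes every element as a function argument and can blow the call stack:
// const minLat = Math.min(...lats); // RangeError: Maximum call stack size exceeded

// A single pass needs no giant argument list and only constant extra memory:
let minLat = Infinity;
let maxLat = -Infinity;
for (const lat of lats) {
  if (lat < minLat) minLat = lat;
  if (lat > maxLat) maxLat = lat;
}
```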

I also decided to do a bit of profiling, like @wu-lee did in #256, since I wanted to learn what on earth all those graphs mean and how to use them, and the load was taking so long while testing the above. Sorry, probably a bit out of the scope of this ticket, but it was fun and useful.

I found the 'flame graph' the most useful way to find bottlenecks. A couple of super low-hanging fruit seemed to be all the individual addLayer calls that happen for each marker created, as @wu-lee found, and also the array.splice() call that does a sorted insert for every initiative parsed by PapaParse (see below).

[Image: flame graph from the profiler, showing the addLayer and array.splice() bottlenecks]

After doing all of these small fixes, the 500K map was able to load pretty quickly for me in Chrome (~9 seconds for initial map load but then a further 28 seconds for all the markers to appear). The map could be navigated fairly smoothly, even without the canvas renderer. See this screen recording: https://drive.google.com/file/d/1UJlmeSqtOlGXYP43pD8jsGSlhUrAAnCu/view?usp=sharing

Finally, I enabled the canvas renderer, which maybe gave a slight improvement, but it was hardly noticeable. I think this is because the way we display markers using Leaflet.AwesomeMarkers isn't actually compatible with canvas mode, so all the markers still have a DOM element.

I think the next step is for others to check how usable the 500K map is after my optimisations. It might just be that it's usable because my computer is quite powerful. If not, then we need to make some bigger changes. One option would be to actually use canvas rendering properly, by using a library that supports it, e.g. https://github.com/eJuke/Leaflet.Canvas-Markers

I think it's also clear that if we stick with Leaflet, we need to change the clustering library, since a single bulk addLayers call is still taking 28s (and doing lots of addLayer calls behind the scenes, according to the profiler).

rogup commented 2 weeks ago

Here is the Mykomap branch where I made the above changes https://github.com/DigitalCommons/mykomap/tree/cwm-test-maps-optimisation

I'll make a hook runner target to deploy this to https://dev.maps.solidarityeconomy.coop/experiment/cwm-test-maps/ as we continue testing and trying out things

lin-d-hop commented 2 weeks ago

Thanks for all of this @rogup! Awesome!

So just to clarify: 1) you changed to a single bulk addLayers() call, and 2) you changed the array.splice() usage so that it doesn't do a sorted insert on every parse.

It's a good improvement, but still far too slow.

Am I right that the next thing to try is to swap out Leaflet.markercluster for SuperCluster? Is this something we can try easily now that the single bulk addLayers() call update has happened? If so, it would be great to try this ASAP.

Are there any other optimisations that either: a) you've done and I didn't note above, or b) you've spotted but haven't done yet?

Thanks!

wu-lee commented 2 weeks ago

To be able to test this, I had to get the map working for me on Chrome, which was hitting a RangeError: Maximum call stack size exceeded. This was due to us applying the min and max functions over a huge array, and it was trivial to reimplement in a way that uses less memory.

Curious where that change is? Can't see it in Mykomap or cwm-test-maps source, presumably in the former though?

[edit: ignore this - problem with git fetch not updating all branches as expected]

rogup commented 2 weeks ago

@lin-d-hop

My changes were:

  1. Remove all the addLayer() calls when each individual marker is loaded, keeping only a single bulk addLayers() call (which we already also did at the end of processing for some reason). Big perf gains, but it still takes 28s. (See the sketch after this list.)
  2. Remove the sorted insert that used array.splice() for every parsed initiative. Now we just array.push() and then do an array.sort() at the end. Big perf gains.
  3. Make canvas rendering the default for Leaflet. Not sure if this made a difference; all the markers seem to still have DOM elements.
  4. Implement a different method of finding the lat-lng bounds of the initiatives. No change to time perf, but it uses less memory, so it works on more browsers.
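As a sketch of what changes 1 and 2 amount to, assuming Leaflet.markercluster and placeholder names (map, parsedInitiatives) rather than the actual Mykomap code:

```ts
import * as L from 'leaflet';
import 'leaflet.markercluster';

// Illustrative stand-ins for the real map and the rows parsed from the CSV.
declare const map: L.Map;
declare const parsedInitiatives: { lat: number; lng: number; name: string }[];

// Change 2: collect everything first and sort once at the end,
// instead of doing a sorted insert (array.splice) for every parsed row.
const initiatives = [...parsedInitiatives].sort((a, b) => a.name.localeCompare(b.name));

// Change 1: build all the markers up front and hand them to the cluster group
// in one bulk addLayers() call, rather than one addLayer() call per marker.
const cluster = L.markerClusterGroup();
cluster.addLayers(initiatives.map(i => L.marker([i.lat, i.lng], { title: i.name })));
map.addLayer(cluster);
```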

This flame graph gives a really nice view of the things that are taking a long time during the load (~40s in total): https://share.firefox.dev/3W39IFc

Based on that, the other potential optimisations I can see, in the order I think we should investigate them, are:

  1. Replace leaflet.markercluster with SuperCluster, as per https://andrejgajdos.com/leaflet-developer-guide-to-high-performance-map-visualizations-in-react/. I'm not sure how simple it will be to replace this; I think @wu-lee did a bit more investigation into it. Maybe there's a way for us to try it quickly as a very rough test (see the sketch after this list). Potentially a 30s gain.
  2. Refactor the marker UI code. A lot of the load time during parsing of the CSV is spent creating a JS object for every marker. This may also make navigation around the map slow, especially on phones and slow devices. There are probably different options here and it needs more investigation. One option would be to use a different marker library (e.g. https://github.com/eJuke/Leaflet.Canvas-Markers) so we can actually use canvas rendering and not create so many DOM elements. Potentially a 3s gain.
  3. Move some processing until after the initial map load, or into the data pipeline. E.g. maybe we can pre-sort initiatives in the CSV, pre-generate a cluster index (if SuperCluster still takes a while), and gradually populate the map with markers as they are loaded. Potentially a 4s gain.
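To give a feel for option 1, here's a minimal Supercluster sketch; the GeoJSON conversion and the names used are assumptions, not our actual code. Clustering happens in plain JS, and only the clusters for the current viewport and zoom get turned into Leaflet markers:

```ts
import Supercluster from 'supercluster';

// Illustrative stand-in for the rows parsed from the CSV.
declare const parsedInitiatives: { lat: number; lng: number; name: string }[];

// Build the spatial cluster index once, after parsing.
const index = new Supercluster({ radius: 60, maxZoom: 16 });
index.load(parsedInitiatives.map(i => ({
  type: 'Feature' as const,
  geometry: { type: 'Point' as const, coordinates: [i.lng, i.lat] },
  properties: { name: i.name },
})));

// On each map move/zoom, fetch just the clusters in the current bounds and zoom level,
// and render only those as Leaflet markers instead of adding 500K layers up front.
const visibleClusters = index.getClusters([-180, -85, 180, 85], 2);
```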

I think that all 3 are probably necessary to some degree, if the target is <3s load time. And I don't think changing to Mapbox will be a magic fix since a lot of these optimisations are related to other parts of the code.

wu-lee commented 2 weeks ago

Ok, have done a little testing on my desktop.

Loading is much, much faster - great! But as observed it still takes quite a long time to load.

Will do more testing, see if I can work out why it's different between retrials, and try to get the profiler working. There might be some difference between reloading the tab, simply navigating back and forward, opening in a fresh tab, etc.

One of @rogup's optimisations was to comment out the per-initiative addLayer() calls, which AFAIK is how the pins are added, as well as the contents of the MarkerManager.updateVisibility() method (amongst other things). Not sure of the consequences of that offhand; I'm a bit surprised there were pins at all, and presumably this means the filtering won't be able to update the visible pins. Perhaps that's making the initial load faster by shunting the pin-adding and slow cluster computation until later, when there's a map refresh?

[Image: profiler screenshot of the weirdly slow CSV load time mentioned above]

rogup commented 2 weeks ago

@wu-lee I've tried quite a few times, removing cache in between the tries, and I haven't hit the slow CSV problem. I'm also not seeing much of a slowdown when using the profiler. Maybe my Mac is a bit faster and better at handling this.

Yes, I expect updateVisibility will need to be fixed so that filtering works. I didn't bother doing this for this barebones test, since we don't need that functionality yet.

If you search for addLayers(, you'll see where all the pins get added after loading. This method seems to be asynchronous and takes ~30s, as you noticed, which is why I think the next step would be to try SuperCluster.

Do you think the jerkiness when moving around the map is acceptable? It felt OK to me, but maybe it's not good enough on all computers. If we decide we need to improve this, then there's a big decision to make:

  1. Stay with Leaflet and try canvas rendering (using https://github.com/eJuke/Leaflet.Canvas-Markers)
  2. Move to Mapbox and use their symbol layer (https://docs.mapbox.com/style-spec/reference/layers/#symbol) which renders on the WebGL canvas. I'm not entirely sure how this would work with clustering, but a quick web search suggests it's possible

I would probably lean towards trying option 1 with a time box, since it's probably much faster. But option 2 does have the advantage of making MM more consistent with LX.
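For comparison, a rough sketch of what option 2 might look like in Mapbox GL JS, with clustering handled by the GeoJSON source and points drawn as a WebGL symbol layer (the token, source URL and icon name are placeholders):

```ts
import mapboxgl from 'mapbox-gl';

mapboxgl.accessToken = '<token>'; // placeholder
const map = new mapboxgl.Map({ container: 'map', style: 'mapbox://styles/mapbox/streets-v12' });

map.on('load', () => {
  // Clustering is handled by the GeoJSON source itself.
  map.addSource('initiatives', {
    type: 'geojson',
    data: '/initiatives.geojson', // placeholder URL
    cluster: true,
    clusterRadius: 50,
  });

  // Unclustered points, drawn on the WebGL canvas rather than as DOM elements.
  map.addLayer({
    id: 'initiative-points',
    type: 'symbol',
    source: 'initiatives',
    filter: ['!', ['has', 'point_count']],
    layout: { 'icon-image': 'marker-15' }, // placeholder sprite icon
  });

  // Cluster bubbles.
  map.addLayer({
    id: 'initiative-clusters',
    type: 'circle',
    source: 'initiatives',
    filter: ['has', 'point_count'],
    paint: { 'circle-radius': 18, 'circle-color': '#51bbd6' },
  });
});
```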