mapbox / mapbox-gl-js

Interactive, thoroughly customizable maps in the browser, powered by vector tiles and WebGL
https://docs.mapbox.com/mapbox-gl-js/
Other
11.2k stars 2.22k forks source link

Use (shallow) merge semantics for `map.setStyle()` #6701

Open anandthakker opened 6 years ago

anandthakker commented 6 years ago

Motivation

Map#setStyle() currently does a deep equality comparison when diffing sources: https://github.com/mapbox/mapbox-gl-js/blob/a5a8dfe5b90737dbb295eaca7f41c667ae4060a8/src/style-spec/diff.js#L153

This is unnecessarily costly for styles using inline / dynamically-created GeoJSON data. And more generally, it's very common that the caller of map.setStyle() knows that the sources haven't changed.

See also @averas's comment here: https://github.com/mapbox/mapbox-gl-js/pull/3643#issuecomment-275741172

Design Alternatives

4006

4000

Design

Per the title, I propose that we do a shallow merge from map.setStyle(). In practice, this means that a user could omit omit any top-level field (sources, layers, etc.) of the style they're passing to map.setStyle(), and we would just use the existing style state for that field.

Mock-Up

map.setStyle({
   layers: [ ... new layers ...]
   // don't worry about setting sources:, sprite:, etc.
})
jfirebaugh commented 6 years ago

Hmm... my first reaction is pretty strongly against this. I think a lot of the benefit of smart setStyle comes from that fact that it's semantics are dead simple: you're replacing the thing wholesale without special rules like shallow merges or optional top-level properties.

What if we want to introduce non-required top-level properties, where it would be important to be able to distinguish between an intent to omit the value so that it's restored to the default value, versus an intent to retain the existing value via shallow merge? (In fact, light is arguably already such a property.)

anandthakker commented 6 years ago

I agree that we'd lose the dead-simplicity here, and that that's a real loss. But IMO, the bulk of smart setStyle's benefit comes from it allowing for declarative rather than imperative approach to runtime styling, and this wouldn't change that.

Worth noting that the proposed semantics here are what we chose to use for setFeatureState, and are analogous to React's setState.

What if we want to introduce non-required top-level properties, where it would be important to be able to distinguish between an intent to omit the value so that it's restored to the default value, versus an intent to retain the existing value via shallow merge? (In fact, light is arguably already such a property.)

Good point... I guess we'd have to do something like require an explicit null value.

andest01 commented 6 years ago

Good thread. I have two primary projects: one uses a large amount of embedded GeoJSON data, and the other project has heavier data-driven expressions. For both of these, setStyle can be very burdensome.

That being said, setStyle's "dead simple" approach was dramatically preferable to the spaghetti code of addSource vs setData vs setFilter while trying to maintain my layers' references to existing geometry.

How do we balance the simplicity of setState while avoiding the cost of cloning objects -- especially large GeoJSON objects?

Much of my code is written in React, and uses referential equality (===) to infer that nothing has changed. In other words, I lean towards an immutable style. Would it be possible to take advantage of this approach when deciding which elements to copy/clone/update?

anandthakker commented 6 years ago

Thanks @andest01.

For both of these, setStyle can be very burdensome.

Can you say more about what makes setStyle() burdensome in the case of "heavier data-driven expressions"? Is it that the expressions are large enough that the deep equality check is noticeably expensive?

uses referential equality (===) to infer that nothing has changed. In other words, I lean towards an immutable style. Would it be possible to take advantage of this approach when deciding which elements to copy/clone/update?

Yep, @jfirebaugh and I were discussing this yesterday and leaning towards such an approach. See also #4875

andest01 commented 6 years ago

Hi @anandthakker.

I'll talk a little more about setStyle and large data-driven expressions. My imagination is telling me that there may be more work than just cloning and evaluation -- information needs to be sent from system memory to a GPU and that is notoriously expensive -- but I'll try to provide you with a specific circumstance.

Suppose that you have a large amount of Admin 2 geometries -- say fewer than 5000 of them, distributed across a large area. These geometries are served from a tile server. The geometries have only an identifier, perhaps a four or five digit string, in their properties.

In order to match up your data that you're receiving in real time (as short as 500 ms per update) that has an ID and a piece of data, you'll likely need to use a data driven expression. The Mapbox Style Expression documentation is a bit.... shall we say, laconic, so you elect to use the literal dictionary approach. It's a big dictionary -- about five thousand id-value pairs, each with a specific color, but computers are fast these days, right?

const admin2Expression = [
  'get',
  ['to-string',
    [
      'get',
      'some_id',
    ],
  ],
  [
    'literal',
    {
      '01001': 'green',
      '01002': 'blue',
      '01003': 'greenish-blue',
      '01004': 'gray',
      // do this for about 5000 items 
    },
  ],
]

// use a data-driven expression in our paint property
const paintProperty = { 
  'paint': {
    'fill-color': ['coalesce', countyFillExpression, SOME_FALLBACK_COLOR_LIKE_GRAY_OR_WHATEVER],
  },
}

Also, given advanced visual design requirements (selection, highlighting, inner and outer borders), you'll need to create multiple layers of data-driven paint properties. Now you have multiple layers (say 5x), each with data driven styles backed by the same logic. These paint properties are created by unit-tested functions, and calling functions is easy, so your engineers merely shrug and re-use the dictionary multiple times, each for a given data driven paint property.

So let's back up for a second and take stock. Because the tile-server only stores IDs and our data changes in real-time, we're forced to somehow transcribe that data into a mapbox Style object. We're using the expression syntax of literal to create a very large paint property as a dictionary -- not an array, but a dictionary. Additionally, because of advanced visual requirements, we have re-used said dictionary multiple times. Working code is working code, after all, and we're just re-using a reference so how bad could it be?

You can probably see where this is going. :)

The result is that setStyle creates a very very blocking CPU call, and much of it appears to either be equality or serialization, or both. I'm not saying this is the right way to do it, nor the intended way of doing it. Just by typing all this stuff out, I had about three new ideas on how to do it differently (most of them hinging on using filters and arrays instead of paint properties).

The point is to help show you various ways that setStyle -- while extremely useful and darn close to ideal in terms of elegance -- but it can become very slow in extreme cases. I hope these posts help you isolate which cases those tend to be. My belief is that they're either very large GeoJSON files, or very large data-driven choropleth styles. I also accept a certain level of ignorance on my part. :)