python-visualization / folium

Python Data. Leaflet.js Maps.
https://python-visualization.github.io/folium/
MIT License
6.9k stars 2.22k forks

GeoJson Filtering #936

Closed jtbaker closed 1 year ago

jtbaker commented 6 years ago

Got some ideas for further GeoJson functionality.

Leaflet's GeoJson class accepts several options in its callback that we're already utilizing to some extent (see the Leaflet docs). One that is available, but which we're not yet using, is filter. It wants a function that returns a boolean for each feature in the GeoJson.

An idea I had, that I thought might be useful, would be to utilize this filtering after we've loaded the GeoJson into the browser. Perhaps most significant would be letting users toggle features on or off based on a value in their properties. This would be most useful for some kind of ordinal/discrete data that can be grouped together into a Set - not so much for a quantitative data field where nearly every feature has a unique value.

Unfortunately, filter is not available as a normal class method - it's only available in the GeoJson callback. This means that to redraw, the original L.geoJson object needs to be removed from the map and readded, with a new filter subset in the callback. This isn't optimal, but would allow us to do some more dynamic display methods, for instances when it might not make sense to add an entire separate layer to the LayerControl.

Here's a fiddle I made as a little use case scenario. It looks a bit more complex than our use case would be, since I'm loading the GeoJson asynchronously with a fetch call.

Let me know what you think - if it's feasible and worth spending time to try and incorporate, or outside the scope of the project.

Conengmo commented 6 years ago

I looked at the filter option in geojson and it does look useful. To be honest I don't regularly use geojson files, but I can imagine there are use cases for this option. Unfortunately your fiddle only showed me two gray rectangles.

I don't really get what you mean by filtering after loading. How I understand it is that any feature that doesn't pass the filter is not displayed. And that this is not dynamic, but done once on page load.

Is this not trivial to do in Python by the way? Looping over the features and removing them (or something like that) when they don't pass a certain test.
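A minimal sketch of that Python-side filtering, assuming the data is a GeoJSON FeatureCollection already loaded as a dict. The helper name filter_features and the sample STATUS values are made up for illustration; the filtered dict could then be handed to folium.GeoJson as usual.

```python
def filter_features(geojson, predicate):
    """Return a new FeatureCollection holding only features that pass predicate."""
    # Copy the top level so the caller's dict keeps its full feature list.
    filtered = dict(geojson)
    filtered["features"] = [f for f in geojson["features"] if predicate(f)]
    return filtered


# Hypothetical sample data; real data would come from a GeoJSON file.
roads = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"STATUS": "high"},
         "geometry": {"type": "Point", "coordinates": [-97.74, 30.27]}},
        {"type": "Feature", "properties": {"STATUS": "low"},
         "geometry": {"type": "Point", "coordinates": [-97.75, 30.28]}},
        {"type": "Feature", "properties": {"STATUS": "high"},
         "geometry": {"type": "Point", "coordinates": [-97.76, 30.29]}},
    ],
}

# Keep only the features whose STATUS is "high".
high_only = filter_features(roads, lambda f: f["properties"]["STATUS"] == "high")
```

This runs once, before the map is rendered, so it does not give the dynamic toggling discussed in this thread.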

I guess I'm missing something, maybe you can explain a bit more? How are we going to use these filters dynamically?

jtbaker commented 6 years ago

@Conengmo the fiddle grabs a pretty large GeoJson (like 8MB) from here to load, so depending on your connection speed, it might need a minute to load into the browser. Does your browser support fetch HTTP requests?

The DOM elements are built dynamically from the dataset, so if it's not being retrieved with an HTTP 200 the page content might not be built properly.

The basic premise is to let users of a Map make their own dynamic selection from subclasses or categories of a dataset. In this instance, there is a complete road network dataset that is further classified into four 'comfort levels', along with the speed limit data. I thought a UI improvement would be to allow further selection/unselection within the dataset based on user input from the DOM. The inputs are HTML <input type='checkbox' value=<category>> elements with Event Listener functions that modify an Array of values. Every time the Array changes, it triggers a function that removes the existing GeoJson layer from the map and replaces it with a new one that is filtered based on the contents of the Array.

I thought this could be useful because, as I understand it, GeoJsons are not supported in FeatureGroupSubGroup - but they are a great way to keep sets of data organized together.

This is the crux of the admittedly crude, but plain JS logic with no libraries/frameworks:

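  // Uses await, so this is assumed to run inside an async function.
  var data = await fetch(url).then(response => response.json());

  // Unique STATUS values across all features.
  var values = [...new Set(data.features.map(feature => feature.properties['STATUS']))];

  // Values currently selected; every checkbox starts out checked.
  var options = Array.from(values);

  // Compare as strings (numeric-aware) so non-numeric values sort correctly too.
  for (var record of options.sort((a, b) => String(a).localeCompare(String(b), undefined, {numeric: true}))) {
    var d = document.createElement('div');
    var i = document.createElement('input');
    i.type = 'checkbox';
    i.value = record;
    i.checked = true;
    i.addEventListener('click', function(event) {
      // Toggle the clicked value in the selection, then redraw the layer.
      var val = event.target.value;
      if (options.includes(val)) {
        options = options.filter(item => item != val);
      } else options.push(val);
      makeAMap(data, options);
    });
    var l = document.createElement('label');
    l.appendChild(i);
    l.appendChild(document.createTextNode(record));
    d.appendChild(l);
    document.getElementById('checks').appendChild(d);
  }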

makeAMap is just a function that removes the old and adds a new L.GeoJson feature to the map with filter: feature => vals.includes(feature.properties['STATUS']).

This is more interaction in the DOM outside of the direct scope of Leaflet, and I'm still kind of learning front end development. Usability wise, I think this would be best implemented as a nested 'Tree' structure in the Leaflet LayerControl. But I'm unsure if there is any kind of support for such a structure from Leaflet - I haven't seen one so far.

Conengmo commented 6 years ago

Interesting idea Jason. To be honest I'm not that proficient with Javascript but I understand now what you're trying to do. If I got it right you want to be able to select which features of a GeoJson object to show based on variables in the geojson data. Radio buttons provide the user a way to toggle this dynamically.

I can't really say if the approach you're describing makes good sense or not. I'm thinking that it sounds a bit over-engineered to rebuild the GeoJson feature on each toggle.

If we already know which geojson features have which values, can't we put them as separate layers in LayerControl from the start? Maybe even with using FeatureGroupSubGroup. We'd have to split the provided geojson in separate GeoJson instances.
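A rough sketch of the splitting step, assuming the input is a FeatureCollection dict and that every feature has a properties dict. The helper name split_by_property and the sample data are hypothetical, not part of folium.

```python
from collections import defaultdict


def split_by_property(geojson, key):
    """Group features by the value of properties[key], one collection per value."""
    groups = defaultdict(list)
    for feature in geojson["features"]:
        groups[feature["properties"].get(key)].append(feature)
    return {
        value: {"type": "FeatureCollection", "features": features}
        for value, features in groups.items()
    }


# Hypothetical sample data standing in for a real road network.
network = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"STATUS": "high"},
         "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}},
        {"type": "Feature", "properties": {"STATUS": "low"},
         "geometry": {"type": "Point", "coordinates": [1.0, 1.0]}},
        {"type": "Feature", "properties": {"STATUS": "high"},
         "geometry": {"type": "Point", "coordinates": [2.0, 2.0]}},
    ],
}

by_status = split_by_property(network, "STATUS")
# Each collection could then become its own layer, for example
# folium.GeoJson(collection, name=str(value)).add_to(m) for every item,
# so that LayerControl lists one checkbox per STATUS value.
```

The trade-off is that each value becomes a separate layer in the control, which is what the filter idea was trying to avoid for datasets with many categories.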

What do you think about the trade-off where stuff should happen, in Python or Javascript?

jtbaker commented 6 years ago

Yes - that's definitely a possibility - although I thought I saw an issue a few months ago where someone was having issues using GeoJsons and FeatureGroups together. Found it in #904. I'm not sure if this would affect other GeoJson functionalities like tooltipping.

I think the biggest issue (feedback from map users I've gotten) with the current FeatureGroup/FeatureGroupSubGroup implementation is that it's not clear to users which 'children' layers belong to which 'parents', because their input checkboxes are indented at the same level - I think a nested tree structure would make more clear which elements belonged to which.

I did some more research and found a repo that looks like a more intuitive UI implementation of the LayerControl: https://github.com/ismyrnow/leaflet-groupedlayercontrol.

Perhaps we could accept a parent_group kwarg to our map layers, and write a new GroupedLayerControl plugin to extend the current LayerControl functionality, grouping the child layers into dictionaries with parent keys.
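To make the parent_group idea a bit more concrete, here is a small sketch of the grouping step only. The Layer class, its parent_group field and group_layers are all hypothetical stand-ins; nothing here is existing folium API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    """Hypothetical stand-in for a map layer carrying a parent_group kwarg."""
    name: str
    parent_group: Optional[str] = None


def group_layers(layers):
    """Collect layer names into a dict keyed by their parent group."""
    grouped = defaultdict(list)
    for layer in layers:
        grouped[layer.parent_group].append(layer.name)
    return dict(grouped)


layers = [
    Layer("High comfort", parent_group="Roads"),
    Layer("Low comfort", parent_group="Roads"),
    Layer("Bus stops", parent_group="Transit"),
    Layer("Basemap"),  # no parent: ends up under the None key
]

grouped = group_layers(layers)
```

A grouped layer control template could then iterate over this dict and render one heading per parent key with its children listed underneath.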

jtbaker commented 6 years ago

Philosophically, design pattern wise, I think as much of the logic should execute in the JavaScript client side as possible. If we're doing stuff on the Python side, I think that sends us down a road of embedding logic into our data - like style_function and highlight_function are implemented right now, which I don't think is the best case scenario and will limit folium's ability to scale to larger projects as the size of the data bloats, and will not run as performantly.

I like writing Python better and think it is cleaner - but would like to optimize and factor these as well as possible.

I looked into Transcrypt but it doesn't seem like it is so simple to parse Python functions into JS that could be dropped into the template cleanly. Do you have any experience with such libraries?

Conengmo commented 6 years ago

I think as much of the logic should execute in the JavaScript client side as possible

Our users operate in Python, so that's where we want to provide error messages. We cannot get messages back to the user from Javascript. Any data should be first handled in Python, so we can let the user know when the input is wrong. And new, unexpected errors will be easier to find and report when they happen in Python.

Another point is that we don't have a good way to test the rendered Javascript, so any logic moved to Javascript is left out of the tests. This is something that should be solved, but I think it's still a valid point that our Python test suite works easier with Python code.

will not run as performantly

We need to learn more about how to optimize for Javascript client-side performance.

embedding logic into our data

The problem with that is in Python, when a user expects his geojson dictionary to remain unchanged. That is easily solved by copying it before adding the styling. Currently, folium fully relies on embedding all data. It would require some rethinking to work with external data sources better without embedding.
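A minimal sketch of the copy-first approach, assuming the style ends up under a 'style' key in each feature's properties. That key and the helper name apply_style are assumptions made for illustration, not a description of folium's internals.

```python
import copy


def apply_style(geojson, style_function):
    """Return a styled deep copy; the caller's dict is not modified."""
    styled = copy.deepcopy(geojson)
    for feature in styled["features"]:
        # Assumption: the style dict is embedded under properties["style"].
        feature.setdefault("properties", {})["style"] = style_function(feature)
    return styled


# Hypothetical sample data.
districts = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"Majority Party": "Republican"},
         "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}},
    ],
}


def style(feature):
    party = feature["properties"]["Majority Party"]
    return {"fillColor": "red" if party == "Republican" else "blue"}


styled = apply_style(districts, style)
```

The deep copy doubles the memory used for the data while the map is being built, which matters for the large files mentioned earlier in the thread.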

looked into Transcrypt

Interesting, I never heard of that one before. I don't really see on first glance how this would be useful, since we already work with templates.

What I like about the current style_function and highlight_function is that they allow a user to write a function in Python. I would rather that our users don't have to write a single line of Javascript. If they were proficient at Javascript, they likely wouldn't be using folium.

jtbaker commented 6 years ago

What I like about the current style_function and highlight_function is that they allow a user to write a function in Python.

I like this too. It allows the user to iterate through each feature in the GeoJson and customize its styling. I just wish that this could be done without mutating the data, since we loop through the features again client side in .setStyle, which we could use to query against properties and return a styling dictionary.

Ideally, something that could convert

def style(feature):
    obj = {'color':'black','weight':2,'fillOpacity':0.8}
    if feature['properties']["Majority Party"]=='Republican':
        obj.update({'fillColor':'red'})
    elif feature['properties']["Majority Party"]=='Democrat':
        obj.update({'fillColor':'blue'})
    return obj

into a string:

"""
function(feature){
var obj = {'color':'black','weight':2,'fillOpacity':0.8}
if(feature['properties']['Majority Party']=='Republican')
{obj=Object.assign({"fillColor":"red"},obj)}
else if(feature['properties']['Majority Party']=='Democrat')
{obj=Object.assign({"fillColor":"blue"},obj)}
return obj
}
"""

to pass to onEachFeature in the callback via a Jinja2 {{ variable embed }}.

This doesn't seem too complex, at least with basic if/else and comparison operators, and we could use dictionary/Object key:value coupling, but I'm having a hard time finding something that could accomplish such a translation task dynamically on the fly. I think we could still do some feature.properties[key] validation Python side.
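One way to sidestep the translation problem, sketched here as an alternative and not as a Python-to-JS translator: describe the rule as plain data in Python and render a fixed JavaScript function around it with json.dumps. The helper name make_style_js and its parameters are hypothetical.

```python
import json


def make_style_js(property_name, fill_colors, base_style):
    """Render a JS style function that looks up fillColor by property value."""
    # json.dumps produces valid JS literals for dicts and strings.
    return (
        "function(feature) {\n"
        f"  var base = {json.dumps(base_style)};\n"
        f"  var fills = {json.dumps(fill_colors)};\n"
        f"  var value = feature.properties[{json.dumps(property_name)}];\n"
        "  var extra = fills.hasOwnProperty(value) ? {fillColor: fills[value]} : {};\n"
        "  return Object.assign({}, base, extra);\n"
        "}"
    )


js = make_style_js(
    "Majority Party",
    {"Republican": "red", "Democrat": "blue"},
    {"color": "black", "weight": 2, "fillOpacity": 0.8},
)
print(js)
```

This only covers lookups by a single property value, so it is far less general than a user-written Python function, but the resulting string could be embedded in a template without touching the GeoJson data.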

I know this is kind of pie in the sky, I've just gotten into more of the functional programming school of thought lately - and in a perfect world our functions would be able to evaluate without mutating the state of our data.

In my mind, GeoJson is a straightforward, easy to use, standardized format for data interchange that can be used by a variety of applications - users of folium Maps could share their data in this format - but it's now convoluted/confused because it contains elements of our application logic that aren't directly related to the data - only how we are representing it.

Going to do some more research on this and see what I can find and think of.

satian7 commented 6 years ago

I am working on sort of a similar map but my data is completely in geojson. Is there a way that I can add a drop-down to zoom on a point on the map using the json[feature][properties][name] as a key from the Geojson?

jtbaker commented 6 years ago

Isn't that what the Search plugin does right now?

http://nbviewer.jupyter.org/gist/jtbaker/0dae53b5d0fac92f67cbfd46379087ad

satian7 commented 6 years ago

Yes, it does, but it conflicts with the marker cluster. I guess there is no way to turn those markers off.

Conengmo commented 5 years ago

Hi @jtbaker, I’ve been rereading this thread and that layer control grouping plugin you mentioned looks promising. https://github.com/ismyrnow/leaflet-groupedlayercontrol

Could this be a better and more general way than featuregroupsubgroup?

Conengmo commented 1 year ago

We did merge the grouped layercontrol plugin recently: https://github.com/python-visualization/folium/pull/1592.

I think extensive GeoJSON filtering on the front-end side is out of scope for folium.

Instead, example notebooks on how to split your geojson data and use plugins like feature-group-sub-group and grouped-layer-control would be welcome.

I'll close this issue since I don't think much else will come from it. If somebody wants to reopen this discussion, feel free to comment here, or maybe better to open a new issue.