PermafrostDiscoveryGateway / pdg-portal

Design and mockup documents for the PDG portal
Apache License 2.0

Create data layer of community names and locations #5

Open laurenwalker opened 3 years ago

robyngit commented 1 year ago

Mike Brook will send us some GeoJSON with community names and coordinates that we can add as a first version of this layer. Since the markers might crowd the map a little, it would be nice to eventually have some sort of raster layer with place names that could be layered over a base layer.

robyngit commented 1 year ago

Mike sent us a GEOJSON extract from our placenames database here

He said:

My intention is that you'll grab the GEOJSON and bring it into your system vs. hitting this URL continuously.

Note that it filters by latitude and population. The URL above gets all communities >=55 latitude with >=10000 people. I've also added the ability to loosen the population constraint for Alaska, since it would be extremely sparse otherwise (includes communities >=500 people in the example above). 55 latitude + 10K population seemed like roughly the right parameters to populate the map with a reasonable number of place dots, but you can fiddle with the parameters to get more/fewer.
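For anyone who wants to tune these parameters on a locally stored copy rather than through the URL, the filter described above could be approximated as follows. This is a minimal sketch, not Mike's implementation: the property names `population` and `region` are assumptions about the GeoJSON schema and should be checked against the actual extract.

```python
def keep_community(feature, min_lat=55.0, min_pop=10_000, alaska_min_pop=500):
    """Return True if a GeoJSON point feature passes the latitude/population filter."""
    # GeoJSON stores point coordinates as [longitude, latitude].
    lon, lat = feature["geometry"]["coordinates"][:2]
    props = feature.get("properties") or {}
    population = props.get("population") or 0  # assumed property name
    if lat < min_lat:
        return False
    # Assumed property: "region" holds the state or province name.
    # Alaska gets a looser population threshold so it is not left sparse.
    threshold = alaska_min_pop if props.get("region") == "Alaska" else min_pop
    return population >= threshold


def filter_collection(collection, **kwargs):
    """Return a new FeatureCollection with only the features that pass the filter."""
    kept = [f for f in collection["features"] if keep_community(f, **kwargs)]
    return {"type": "FeatureCollection", "features": kept}
```

Raising or lowering `min_pop` or `min_lat` would then give fewer or more place dots, in the same spirit as fiddling with the URL parameters.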

I'm thinking this would make a worthwhile layer on the PDG map, since it would give a lot of context to the other layers. I'm not sure if searching layers is on the roadmap, but that would also be very helpful with this layer - e.g. you could search for "Noatak" and zoom to that part of the map, where you would probably then inspect whichever layers were of interest.

@mbjones - I think that it would make sense to store this geojson in the same way we store the image web tiles and cesium tiles (in the /var/data/tiles dir). However, this dataset does not have an associated record. Do you think that it might make sense to archive it on ADC? Otherwise, should I just upload it as a data object without associated metadata?

mbjones commented 1 year ago

Let's discuss. We should archive it on ADC if it is a valuable dataset that should be preserved and be citable, and which we are allowed to redistribute. If it is a temporary aid to visualization that doesn't have potential value for other users, then keeping it in the tiles directory would work.

julietcohen commented 1 year ago

The Climate Mapping for Resilience and Adaptation portal has this feature. The user guide specifies that users can search within 3 categories of geographies: Census Tract, County, or Tribal Land. I think the Tribal Land category would be a cool addition to the PDG search-by-community-name function, perhaps represented as "indigenous" rather than "tribal". This would not necessarily have to be a separate category from County or City as it is in the climate portal. It could also be a color differentiation in the name or bounding box around the region.

Perhaps indigenous communities are already included in the GEOJSON extract from our placenames database that Mike provided, which Robyn linked above. If so, we could add a property for the type of community.

julietcohen commented 1 year ago

Communities layer added to PDG demo portal

This Communities layer has been added to the demo portal as point geometries. The categorical color palette represents the country that each community falls within. When a user clicks on a certain community, the tabular data for that community is shown in the pop-up window.

We are waiting on feedback and a few details from Mike before we move it to the production portal:

  • data layer description (just a few sentences)
  • citation for the data
  • Is this dataset already archived somewhere? If not, I would be happy to help you archive it on the Arctic Data Center.
  • link for "Full Details" button. This can be a link to the archived dataset, or a LEO Network link, etc.
  • link for "Download Data" button. I currently have the raw geoJSON data linked here.
  • Do you have a dataset that represents the bounds of each community? Having polygon geometries, rather than points, would allow us to map the communities in a way that better shows overlap with other data layers.

julietcohen commented 1 year ago

Legend exploration: coloring communities based on population

The community populations are heavily skewed: most communities are small, and only a few have very large populations. In the XML, we can assign a continuous palette to a property of the data, but we do not specify a palette name. Rather, we specify which color (CSS hex code) is assigned to each break, and the colors are mapped to the data accordingly. We can choose the number of bins, and the bins do not have to be equally sized.

For data that is all positive values, it makes sense to use a monochromatic color palette, with the most intense color representing the largest populations.

When assigning bins for this skewed data, we need narrow bins at the small end of the data range and much wider bins towards the large end. That way, each color is represented more equally on the map. If we simply made the bins equally sized, almost every flag would take the color of the smallest bin, the few flags for the largest populations would take the color at the upper extreme of the legend, and the colors in between would be very sparsely represented.
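To make the imbalance concrete, here is a small sketch that counts how many values land in each bin for a given set of breaks. The populations and break values below are made up purely for illustration and are not taken from the communities dataset.

```python
from bisect import bisect_right
from collections import Counter


def count_per_bin(values, breaks):
    """Count values per bin, where bin i covers [breaks[i], breaks[i + 1]).

    Values outside the breaks are clamped into the first or last bin.
    """
    n_bins = len(breaks) - 1
    counts = Counter()
    for v in values:
        i = bisect_right(breaks, v) - 1
        i = max(0, min(i, n_bins - 1))
        counts[i] += 1
    return [counts[i] for i in range(n_bins)]


# Made-up, illustrative populations (NOT the real data): many small, one huge.
populations = [520, 600, 750, 900, 1200, 3000, 8000, 15000, 40000, 5_000_000]

equal_breaks = [0, 1_000_000, 2_000_000, 3_000_000, 4_000_000, 5_000_001]
uneven_breaks = [0, 700, 1000, 5000, 20000, 5_000_001]

print(count_per_bin(populations, equal_breaks))   # [9, 0, 0, 0, 1]
print(count_per_bin(populations, uneven_breaks))  # [2, 2, 2, 2, 2]
```

With equal-width bins, nine of the ten illustrative values share one color; with uneven breaks, each color is used equally often.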

One way to determine bins for this data is to winsorize it at the 90th percentile (capping the largest few values at that threshold so they no longer stretch the range) and then create equally sized bins from the capped values. We can then manually create ~1-3 additional bins to represent the largest few values in the range.
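The approach above could be sketched roughly as follows, using only the Python standard library. The function name, the default number of bins, and the `tail_breaks` argument are illustrative choices, not part of the portal tooling.

```python
import statistics


def winsorized_breaks(values, n_equal_bins=4, tail_breaks=()):
    """Equal-width breaks up to the 90th percentile, plus manual breaks for the tail."""
    # With n=10, quantiles() returns 9 cut points; the last is the 90th percentile.
    p90 = statistics.quantiles(values, n=10, method="inclusive")[-1]
    # Winsorize: cap (rather than drop) every value above the 90th percentile.
    capped = [min(v, p90) for v in values]
    lo, hi = min(capped), max(capped)
    width = (hi - lo) / n_equal_bins
    breaks = [lo + i * width for i in range(n_equal_bins + 1)]
    # Add the manually chosen breaks that lie above the capped range.
    breaks += sorted(b for b in tail_breaks if b > hi)
    # Make sure the final break reaches the true maximum of the data.
    if breaks[-1] < max(values):
        breaks.append(max(values))
    return breaks
```

The returned list can be rounded to convenient values before being used as the break values of a palette.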

For example, a green monochromatic color palette that splits the data into 6 bins looks something like this:

"colorPalette": {
                "paletteType": "continuous",
                "property": "population",
                "colors": [
                  {
                    "color": "#C0EC83",
                    "value": 501
                  },
                  {
                    "color": "#A7E074",
                    "value": 11916
                  },
                  {
                    "color": "#96CC39",
                    "value": 17368
                  },
                  {
                    "color": "#6BA32D",
                    "value": 29969
                  },
                  {
                    "color": "#547A1D",
                    "value": 58893
                  },
                  {
                    "color": "#3B5C0A",
                    "value": 103257
                  },
                  {
                    "color": "#0F4F34",
                    "value": 5028000
                  }
                ]
              },
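When experimenting with different breaks, it may be easier to generate this fragment than to edit it by hand. The helper below is a hypothetical convenience, not part of the portal code; it only reproduces the structure shown above from a list of break values and a list of hex colors.

```python
import json


def build_color_palette(prop, breaks, colors):
    """Pair each break value with a hex color in the config shape shown above."""
    if len(breaks) != len(colors):
        raise ValueError("need exactly one color per break value")
    if list(breaks) != sorted(breaks):
        raise ValueError("break values must be in ascending order")
    return {
        "colorPalette": {
            "paletteType": "continuous",
            "property": prop,
            "colors": [{"color": c, "value": v} for c, v in zip(colors, breaks)],
        }
    }


# Colors and break values copied from the example above.
greens = ["#C0EC83", "#A7E074", "#96CC39", "#6BA32D", "#547A1D", "#3B5C0A", "#0F4F34"]
breaks = [501, 11916, 17368, 29969, 58893, 103257, 5028000]
print(json.dumps(build_color_palette("population", breaks, greens), indent=2))
```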

The smaller population values still need to be split into smaller bins in order to better differentiate between them.

julietcohen commented 1 year ago

Anna suggested the following 2 resources for polygon data, which were recommended to her by her colleague Greg Fiske:

Anna suggested that circles could represent the points, rather than flags. The circle size might represent the population. An example can be found here in the Populated Places figure.
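If circle size were used for population, one common choice is to scale the radius with the square root of the population, so that the few very large communities do not dwarf everything else. The sketch below assumes pixel radii and made-up minimum and maximum sizes; whether the portal's map configuration supports data-driven point sizes is not confirmed here.

```python
import math


def circle_radius(population, max_population, min_radius=3.0, max_radius=20.0):
    """Return a radius (assumed pixels) that grows with the square root of population."""
    if max_population <= 0:
        raise ValueError("max_population must be positive")
    # Clamp the population share into [0, 1] before scaling.
    share = max(0.0, min(population / max_population, 1.0))
    return min_radius + (max_radius - min_radius) * math.sqrt(share)
```

For example, a community with a quarter of the maximum population gets a radius halfway between `min_radius` and `max_radius`.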

I updated the demo portal Communities layer with the following:

To do: