OpenWaterFoundation / owf-app-infomapper-ng

Open Water Foundation InfoMapper web application for menu-driven maps and visualizations, using Angular
GNU General Public License v3.0

Implement graduated classification for raster layer #349

Closed. smalers closed this issue 2 years ago.

smalers commented 3 years ago

Need to implement graduated classification. Currently, single symbol and category classification are enabled. However, for some layers, such as the Colorado Agricultural Water Transfer raster map, category classification produces a legend that is too long and verbose. The graduated classification will be similar to category classification except that there will be two value columns to indicate the range of values. I need to confirm the standard for this in QGIS and ArcGIS. The color table can be specified in a CSV file. There are also "color ramps" that could be auto-generated, such as gray scale. I implemented something in Java years ago and will review it, then coordinate with Josh.

smalers commented 3 years ago

I reviewed the map configuration file specification and the documentation describes the functionality that is needed, for vector and raster layers. Hopefully this is relatively easy to implement given that it just involves checking two numbers instead of one for the symbol. Once it is implemented I may have feedback on default labeling for the range based on comparison with QGIS and Esri. The graduated classification definitely needs to work on numerical values now and in the future might be implemented for strings using some type of string sort order comparison.

Nightsphere commented 3 years ago

The first push has been completed, and there are a few things to keep in mind for this first iteration of the graduated classification.

  1. The Graduated classificationType only works for raster layers at the moment. There were a few bugs to iron out, but implementing it for vector layers should go much quicker, and I will do that later. I wanted to get something out first so that feedback can inform the vector implementation.
  2. There is no default right now for Graduated legends; a classification file must be given.
  3. If the geoLayerSymbol attribute classificationType is set to Graduated, the geoLayerSymbol attribute classificationAttribute will be used to determine the raster band as before. The geoLayerSymbol property classificationFile will be used for the path to the file.
  4. Here is an example of the geoLayerSymbol I used in my testing:
    "geoLayerSymbol": {
      "name": "Colorize municipalities",
      "description": "Symbol for the municipality raster",
      "classificationType": "Graduated",
      "classificationAttribute": "7",
      "properties": {
        "classificationFile": "../map-classification-files/Municipal_Growth-classify-grad-year.csv"
      }
    }
  5. Here is the classification file used:
    
    # Graduated classification table for Municipal Land Development raster.
    # - the value corresponds to raster data value for year
    # Value from attribute is compared as follows to determine symbol:
    #  value >= valueMin
    #  value < valueMax
    valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight
    # 2018 Current - Black (might change to gray)
    -Infinity,2019,#000000,1.0,#000000,0.3,2
    # 2019-2020 Near-term red
    2019,2021,#ff0000,1.0,#ff0000,0.3,2
    # 2021-2025 red-orange
    2021,2026,#ff4500,1.0,#ff4500,0.3,2
    # 2026-2030 orange
    2026,2031,#ffa500,1.0,#ffa500,0.3,2
    # 2031-2035 orange-yellow
    2031,2036,#f8d568,1.0,#f8d568,0.3,2
    # 2036-2040 yellow
    2036,2041,#ffff00,1.0,#ffff00,0.3,2
    # 2041-2045 yellow-green
    2041,2046,#adff2f,1.0,#adff2f,0.3,2
    # 2046-2050 green
    2046,Infinity,#00ff00,1.0,#00ff00,0.3,2

  6. Because valueMax is an exclusive bound, each valueMax is 1 more than the last year displayed in the legend. When showing the valueMax year in the legend, I display (valueMax - 1) to users.

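
The comparison rules from the classification file comments (value >= valueMin, value < valueMax) and the legend adjustment in item 6 can be sketched as follows. This is a minimal TypeScript illustration, not the InfoMapper code: the GraduatedClass interface and the findClass and legendLabel helpers are hypothetical names, the rows are assumed to be already parsed to numbers, and the labels for the Infinity rows are an assumption.

```typescript
// Hypothetical shape of one parsed classification row (only the columns used here).
interface GraduatedClass {
  valueMin: number;
  valueMax: number;
  fillColor: string;
}

// Return the first class where valueMin <= value < valueMax, or undefined if none matches.
function findClass(value: number, classes: GraduatedClass[]): GraduatedClass | undefined {
  for (const c of classes) {
    if (value >= c.valueMin && value < c.valueMax) {
      return c;
    }
  }
  return undefined;
}

// Build a legend label for integer years: the exclusive valueMax is shown as (valueMax - 1).
// The formats used for the open-ended (Infinity) rows are assumptions.
function legendLabel(c: GraduatedClass): string {
  if (c.valueMin === -Infinity) {
    return `< ${c.valueMax}`;
  }
  if (c.valueMax === Infinity) {
    return `>= ${c.valueMin}`;
  }
  return `${c.valueMin} - ${c.valueMax - 1}`;
}

// A subset of the rows from the classification file above.
const classes: GraduatedClass[] = [
  { valueMin: -Infinity, valueMax: 2019, fillColor: "#000000" },
  { valueMin: 2019, valueMax: 2021, fillColor: "#ff0000" },
  { valueMin: 2021, valueMax: 2026, fillColor: "#ff4500" },
  { valueMin: 2046, valueMax: Infinity, fillColor: "#00ff00" },
];
```

In JavaScript, Number("Infinity") and Number("-Infinity") convert to the corresponding infinite values, so the open-ended rows need no special handling in the comparison itself.
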
smalers commented 3 years ago

I did a bit of research and found the example below. I have not had time to review Josh's work but hopefully have time later today. Specific comments are:

  1. Use the default label formatting shown.
  2. Detect the order of labels high to low from the classification file and show the same in the legend.
  3. For out of range error maybe accept * for the values and the labels will be user-supplied. Default to black?
  4. If a custom label text is used, allow property like ${ValueMax} and ${ValueMin} in the label. Default is to use just the numbers as shown in the figure below.

https://experience.arcgis.com/experience/fb52d598982f41faac714b5ebe32e7d1

image

Nightsphere commented 3 years ago

Another push has been merged with updates to how the legend displays each group so that it more closely resembles the example above. There is still quite a bit of testing to do with single-band and multi-band rasters and with vector layers using graduated classification. I'm not convinced I have good test files for all of these cases; maybe I can get some better layers from Steve later.

My testing has been with the multi-band municipal raster layer, as the years make obviously good use of the graduated classification. For a single band however, I only have water districts, which doesn't work very well.

Nightsphere commented 3 years ago

Yet another merge and push has been made. This implements the asterisk ( * ) for valueMin or valueMax and the ${property} notation for a user-defined label. Here is the classify file I used:

valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight,label
# 2046-2050 green
2046,Infinity,#00ff00,1.0,#00ff00,0.6,2,> ${valueMin}
# 2041-2045 yellow-green
2041,2046,#adff2f,1.0,#adff2f,0.6,2,${valueMin} - ${valueMax}
# 2036-2040 yellow
2036,2041,#ffff00,1.0,#ffff00,0.6,2,${valueMin} - ${valueMax}
# 2031-2035 orange-yellow
2031,2036,#f8d568,1.0,#f8d568,0.6,2,${valueMin} - ${valueMax}
# 2026-2030 orange
2026,2031,#ffa500,1.0,#ffa500,0.6,2,${valueMin} - ${valueMax}
# 2021-2025 red-orange
2021,2026,#ff4500,1.0,#ff4500,0.6,2,${valueMin} - ${valueMax}
# 2019-2020 Near-term red
2019,2021,#ff0000,1.0,#ff0000,0.6,2,${valueMin} - ${valueMax}
# 2018 Current - Black (might change to gray)
-Infinity,2019,#000000,1.0,#000000,0.6,2,< ${valueMax}

Note the label string does not need double quotes around it unless it contains a comma in it.

Something else I noticed was that Steve's ValueMin and ValueMax are capitalized. When Papa Parse reads in the file, it uses each header as a key in the row object, so to match the property name I changed the notation to use lower case (valueMin, valueMax) instead. The capitalized form would also work if the valueMin and valueMax headers in the file were capitalized to match.

Maybe what I can do in the future is update the object after it's been created by Papa Parse and just change all keys to be uppercase, then convert whatever's in the property notation to uppercase as well when I compare.
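
A rough sketch of that case-insensitive idea, assuming each parsed row is a plain object keyed by header name (the shape Papa Parse produces when headers are enabled). The Row type and the upperCaseKeys and expandLabel helpers are hypothetical names for illustration, not the InfoMapper implementation.

```typescript
// A parsed CSV row keyed by header name.
type Row = { [key: string]: string | undefined };

// Copy a row so that every key is upper case, making later lookups case-insensitive.
function upperCaseKeys(row: Row): Row {
  const result: Row = {};
  for (const key of Object.keys(row)) {
    result[key.toUpperCase()] = row[key];
  }
  return result;
}

// Replace each ${property} in the label with the matching row value, ignoring case.
// Unknown properties are left as-is so configuration mistakes stay visible.
function expandLabel(label: string, row: Row): string {
  const upperRow = upperCaseKeys(row);
  return label.replace(/\$\{([^}]+)\}/g, (match: string, property: string) => {
    const value = upperRow[property.trim().toUpperCase()];
    return value !== undefined ? value : match;
  });
}
```

With this approach, ${ValueMin} in a label resolves against a valueMin header and vice versa. Papa Parse also has a transformHeader configuration option that could do the key conversion at parse time, if that fits the existing code better.
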

smalers commented 3 years ago

I'm using the Municipal_Growth.tif layer (emailing separately). The following does not work (no labels) and the raster is not drawn on the map so something is broken. I don't see any console messages.

I'm also wondering if the behavior for integers needs to be different from floating point numbers. For example, for integers, it does not make sense to repeat the maximum value on one line and the minimum value on the next line. The commented lines below illustrate this. Should the operators be included in the value columns? There is the matter of configuring the values so that software knows what comparisons to make.... and intelligently using the configuration to create the legend. I don't know how complex the code will be to provide the flexibility. It should be possible for code to detect the data type and perhaps default, or it may be necessary to explicitly indicate the configuration information.

# Category classification table for Municipal Land Development raster.
# - the value corresponds to raster data value for year
# - set colors to indicate red for most immediate development pressure
valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight,label
# 2018 Current - Black (might change to gray)
#-Infinity,<=2018,#000000,1.0,#000000,0.3,2,"<= 2018"
#-Infinity,2018,#000000,1.0,#000000,0.3,2,"<= 2018"
-Infinity,2018,#000000,1.0,#000000,0.3,2
# 2019-2020 Near-term red
#>=2019,<=2020,#ff0000,1.0,#ff0000,0.3,2
2018,2020,#ff0000,1.0,#ff0000,0.3,2
# 2021-2025 red-orange
#>=2021,<=2025,#ff4500,1.0,#ff4500,0.3,2
2020,2025,#ff4500,1.0,#ff4500,0.3,2
# 2026-2030 orange
#>=2026,<=2030,#ffa500,1.0,#ffa500,0.3,2
2025,2030,#ffa500,1.0,#ffa500,0.3,2
# 2031-2035 orange-yellow
#>=2030,<=2035,#f8d568,1.0,#f8d568,0.3,2
2030,2035,#f8d568,1.0,#f8d568,0.3,2
# 2036-2040 yellow
#>=2036,<=2040,#ffff00,1.0,#ffff00,0.3,2
2035,2040,#ffff00,1.0,#ffff00,0.3,2
# 2041-2045 yellow-green
#>=2041,<=2045,#adff2f,1.0,#adff2f,0.3,2
2040,2045,#adff2f,1.0,#adff2f,0.3,2
# 2046-2050 green
#>=2046,<=2050,#00ff00,1.0,#00ff00,0.3,2
2045,2050,#00ff00,1.0,#00ff00,0.3,2

image

I also tried the following. I tried with and without quotes around the label. To handle integers explicitly, a modifier function could be used such as ${valueMin}.plus(1) but not sure if that is the best approach.

# Category classification table for Municipal Land Development raster.
# - the value corresponds to raster data value for year
# - set colors to indicate red for most immediate development pressure
valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight,label
# 2018 Current - Black (might change to gray)
#-Infinity,<=2018,#000000,1.0,#000000,0.3,2,"<= 2018"
#-Infinity,2018,#000000,1.0,#000000,0.3,2,"<= 2018"
-Infinity,2018,#000000,1.0,#000000,0.3,2,"<= ${valueMax}"
# 2019-2020 Near-term red
#>=2019,<=2020,#ff0000,1.0,#ff0000,0.3,2
2018,2020,#ff0000,1.0,#ff0000,0.3,2,"> ${valueMin}-${valueMax}"
# 2021-2025 red-orange
#>=2021,<=2025,#ff4500,1.0,#ff4500,0.3,2
2020,2025,#ff4500,1.0,#ff4500,0.3,2,"> ${valueMin}-${valueMax}"
# 2026-2030 orange
#>=2026,<=2030,#ffa500,1.0,#ffa500,0.3,2
2025,2030,#ffa500,1.0,#ffa500,0.3,2,"> ${valueMin}-${valueMax}"
# 2031-2035 orange-yellow
#>=2030,<=2035,#f8d568,1.0,#f8d568,0.3,2
2030,2035,#f8d568,1.0,#f8d568,0.3,2,"> ${valueMin}-${valueMax}"
# 2036-2040 yellow
#>=2036,<=2040,#ffff00,1.0,#ffff00,0.3,2
2035,2040,#ffff00,1.0,#ffff00,0.3,2,"> ${valueMin}-${valueMax}"
# 2041-2045 yellow-green
#>=2041,<=2045,#adff2f,1.0,#adff2f,0.3,2
2040,2045,#adff2f,1.0,#adff2f,0.3,2,"> ${valueMin}-${valueMax}"
# 2046-2050 green
#>=2046,<=2050,#00ff00,1.0,#00ff00,0.3,2
2045,2050,#00ff00,1.0,#00ff00,0.3,2,"> ${valueMin}-${valueMax}"

image

Nightsphere commented 3 years ago

The changes have been merged, which should display the legend and raster layer on the map. The only way I could reproduce the ${property} values not being formatted is when the geoLayerSymbol's classificationType attribute is not changed to Graduated. I wasn't able to find a good way to determine whether a user wants the classification type to be graduated or not. Maybe I can look at the name of the classification file, and if it contains grad or graduated while the classificationType is Categorized, write an error message?

At first I tried checking whether the ${property} notation was being used in the legend, but if we want to support that notation in Categorized classification files later, that wouldn't help much.

smalers commented 3 years ago

OK, that was my mistake. I totally forgot to change the classification type. Checks that could be added are:

  1. If doing a categorized classification and the delimited file does not have column value, print a warning to console.
  2. If doing a graduated classification and the delimited file does not contain columns named valueMin and valueMax, print a warning to console.
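
The two checks above could look something like the following. This is a sketch under assumptions: the checkColumns helper is a hypothetical name, and the header list is assumed to be available from the CSV parser after the file is read.

```typescript
type ClassificationType = "Categorized" | "Graduated";

// Return warning messages for required columns missing from the CSV header row.
function checkColumns(type: ClassificationType, headers: string[]): string[] {
  const required = type === "Graduated" ? ["valueMin", "valueMax"] : ["value"];
  const warnings: string[] = [];
  for (const column of required) {
    if (headers.indexOf(column) < 0) {
      warnings.push(`${type} classification file is missing required column "${column}".`);
    }
  }
  return warnings;
}

// Example: a graduated file used by mistake with classificationType Categorized.
const headers = ["valueMin", "valueMax", "color", "opacity", "fillColor", "fillOpacity", "weight", "label"];
for (const warning of checkColumns("Categorized", headers)) {
  console.warn(warning);
}
```

Returning the messages instead of logging inside the function keeps the check easy to test and lets the caller decide where the warning goes.
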

I'll call in a bit to discuss what we can do for integer rather than floating point handling. Maybe do the following:

  1. Allow comparison operator to be put in front of each valueMin and valueMax to indicate what comparison should be done, such as >, >=, =, <, <=.
  2. The default for valueMin for integer would be >= and > for floating point.
  3. The default for valueMax for integer would be <= and also for floating point.
  4. Handle the endpoints accordingly.

I don't know if there is a foolproof way to know the data type so maybe default to floating point and allow the operators to be specified to handle integers?

Nightsphere commented 3 years ago

The checks for value and valueMin and valueMax have been added, and warning messages will appear accordingly. User-given operators can also be provided, with or without a space in the valueMin or valueMax properties in the classification file.
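
For illustration, parsing an optional operator prefix (with or without a space) might be sketched like this. The Bound interface and the parseBound, compare, and inRange helpers are hypothetical names, and the defaults used when no operator is given (> for valueMin, <= for valueMax) are an assumption taken from the floating point proposal earlier in this thread.

```typescript
type Operator = ">" | ">=" | "=" | "<" | "<=";

interface Bound {
  operator: Operator | undefined; // undefined when the cell has no operator prefix
  value: number;
}

// Split a cell such as ">=2019", "<= 2020", "2018" or "-Infinity" into operator and number.
function parseBound(cell: string): Bound {
  const match = /^\s*(>=|<=|>|<|=)?\s*(\S+)\s*$/.exec(cell);
  if (match === null) {
    throw new Error(`Cannot parse bound: "${cell}"`);
  }
  const value = Number(match[2]);
  if (isNaN(value)) {
    throw new Error(`Bound is not a number: "${cell}"`);
  }
  return { operator: match[1] as Operator | undefined, value: value };
}

// Apply one comparison operator.
function compare(value: number, operator: Operator, bound: number): boolean {
  if (operator === ">") { return value > bound; }
  if (operator === ">=") { return value >= bound; }
  if (operator === "<") { return value < bound; }
  if (operator === "<=") { return value <= bound; }
  return value === bound;
}

// Check a value against the raw valueMin and valueMax cells of one row.
function inRange(value: number, minCell: string, maxCell: string): boolean {
  const min = parseBound(minCell);
  const max = parseBound(maxCell);
  const minOperator = min.operator !== undefined ? min.operator : ">";  // assumed default
  const maxOperator = max.operator !== undefined ? max.operator : "<="; // assumed default
  return compare(value, minOperator, min.value) && compare(value, maxOperator, max.value);
}
```
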

smalers commented 3 years ago

The functionality is generally working. However, defaults for labels need to be changed. Below is my classification file and corresponding legend. There are some redundant operators. It seems like the logic should be something like:

  1. When handling the operator for values:
    1. If valueMin operator is not specified, assume an operator > for any number.
    2. Else if valueMin operator is specified, use it.
    3. If valueMax operator is not specified, assume an operator <= for any number.
    4. Else if valueMax operator is specified, use it.
  2. When drawing the labels in the legend:
    1. For default legend, if an operator includes = (or equivalent), then there is no need to show an operator.
    2. For valueMin if the operator does not include the = (or equivalent) show the operator to the left of valueMin
    3. For valueMax if the operator does not include the = (or equivalent) show the operator to the left of valueMax
    4. I like the handling of Infinity of showing only one value as long as it is consistent with the above.

Therefore, if the operators are >= and <= (such as for integers), the label would show valueMin - valueMax. If the operators are > and <= (such as for floating point), the label would show: > valueMin - valueMax.
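
The default label logic above might be sketched as follows. The Range interface and the defaultLabel helper are hypothetical names. Showing the operator on the single remaining value for the Infinity rows is an assumption, chosen to match the custom labels used in the classification files in this thread.

```typescript
// One row after the operators have been separated from the numbers.
interface Range {
  min: number;
  max: number;
  minOperator?: ">" | ">=";
  maxOperator?: "<" | "<=";
}

// Build the default legend label for a range.
function defaultLabel(range: Range): string {
  // Defaults when no operator is specified: > for valueMin, <= for valueMax.
  const minOperator = range.minOperator !== undefined ? range.minOperator : ">";
  const maxOperator = range.maxOperator !== undefined ? range.maxOperator : "<=";

  // Open-ended rows show only one value, with its operator (assumption).
  if (range.min === -Infinity) {
    return `${maxOperator} ${range.max}`;
  }
  if (range.max === Infinity) {
    return `${minOperator} ${range.min}`;
  }

  // An operator that includes "=" is not shown; otherwise it goes to the left of the value.
  const minText = minOperator === ">=" ? `${range.min}` : `${minOperator} ${range.min}`;
  const maxText = maxOperator === "<=" ? `${range.max}` : `${maxOperator} ${range.max}`;
  return `${minText} - ${maxText}`;
}
```
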

For now, let's rely on custom labels for integers but continue to try to deal with detecting data type. I will update the documentation to explain these issues.

# Category classification table for Municipal Land Development raster.
# - the value corresponds to raster data value for year
# - set colors to indicate red for most immediate development pressure
valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight,label
# 2018 Current - Black (might change to gray)
-Infinity,<=2018,#000000,1.0,#000000,0.3,2
# 2019-2020 Near-term red
>=2019,<=2020,#ff0000,1.0,#ff0000,0.3,2
# 2021-2025 red-orange
>=2021,<=2025,#ff4500,1.0,#ff4500,0.3,2
# 2026-2030 orange
>=2026,<=2030,#ffa500,1.0,#ffa500,0.3,2
# 2031-2035 orange-yellow
>=2031,<=2035,#f8d568,1.0,#f8d568,0.3,2
# 2036-2040 yellow
>=2036,<=2040,#ffff00,1.0,#ffff00,0.3,2
# 2041-2045 yellow-green
>=2041,<=2045,#adff2f,1.0,#adff2f,0.3,2
# 2046-2050 green
>=2046,Infinity,#00ff00,1.0,#00ff00,0.3,2

image

My second attempt is shown below, using custom labels. Based on this, it looks like InfoMapper is trying to insert operators even when custom labels are used. The value of ${valueMin} and ${valueMax} should not include the operator so as to avoid confusion and give users flexibility. In general, the legend can probably be made simpler than the internal values and operators. We can enhance to also have access to the operator but I think it is confusing and easier to force the user to configure.

# Category classification table for Municipal Land Development raster.
# - the value corresponds to raster data value for year
# - set colors to indicate red for most immediate development pressure
valueMin,valueMax,color,opacity,fillColor,fillOpacity,weight,label
# 2018 Current - Black (might change to gray)
-Infinity,<=2018,#000000,1.0,#000000,0.3,2,"<= ${valueMax}"
# 2019-2020 Near-term red
>=2019,<=2020,#ff0000,1.0,#ff0000,0.3,2,"${valueMin} - ${valueMax}"
# 2021-2025 red-orange
>=2021,<=2025,#ff4500,1.0,#ff4500,0.3,2,"${valueMin} - ${valueMax}"
# 2026-2030 orange
>=2026,<=2030,#ffa500,1.0,#ffa500,0.3,2,"${valueMin} - ${valueMax}"
# 2031-2035 orange-yellow
>=2031,<=2035,#f8d568,1.0,#f8d568,0.3,2,"${valueMin} - ${valueMax}"
# 2036-2040 yellow
>=2036,<=2040,#ffff00,1.0,#ffff00,0.3,2,"${valueMin} - ${valueMax}"
# 2041-2045 yellow-green
>=2041,<=2045,#adff2f,1.0,#adff2f,0.3,2,"${valueMin} - ${valueMax}"
# 2046-2050 green
>=2046,Infinity,#00ff00,1.0,#00ff00,0.3,2,">= ${valueMin}"

image

smalers commented 3 years ago

I tested again and it is still not right. My previous comment still applies. If I specify the label then I am required to put in the operator symbol(s) that are appropriate. The ${valueMin} and ${valueMax} should ONLY include a number, not an operator. In other words, if the column valueMin contains >= 1234, then ${valueMin} for the label will be 1234, NOT >= 1234. Otherwise, the user does not have flexibility in configuring the label.

Nightsphere commented 3 years ago

The newest changes have been implemented. If a user-defined label header has been given in the CSV classification file, the InfoMapper will replace any operators when parsing the ${property} notation in the label string. This way, the valueMin and valueMax fields are untouched and can be used accordingly, and the label will just show the number/whatever the value is.
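
A minimal sketch of that behavior, assuming the row cells are still the raw strings from the CSV file. The stripOperator and expandLabelWithoutOperators helpers are hypothetical names, not the InfoMapper functions.

```typescript
// A parsed CSV row keyed by header name; cells keep any operator prefix.
type Row = { [key: string]: string | undefined };

// Remove a leading comparison operator (and surrounding spaces) from a cell value.
function stripOperator(cell: string): string {
  return cell.replace(/^\s*(>=|<=|>|<|=)\s*/, "").trim();
}

// Expand ${property} in a custom label using only the value, never the operator.
// The row itself is not modified, so the operators remain available for comparisons.
function expandLabelWithoutOperators(label: string, row: Row): string {
  return label.replace(/\$\{([^}]+)\}/g, (match: string, property: string) => {
    const cell = row[property];
    return cell !== undefined ? stripOperator(cell) : match;
  });
}
```

For example, with valueMin set to >= 1234, the label ">= ${valueMin}" expands to ">= 1234" because the operator in the label text comes from the user, not from the cell.
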

Nightsphere commented 2 years ago

I believe this issue can now be closed. The major implementation for the issue has been done, and any other smaller issues can be created in the future if need be. Closing the issue.