PermafrostDiscoveryGateway / pdg-portal

Design and mockup documents for the PDG portal
Apache License 2.0

Display the infrastructure layer #27

Closed: robyngit closed this issue 2 months ago

robyngit commented 1 year ago

The data are associated with the following two papers:

The data are archived with Restricted Access on Zenodo

robyngit commented 1 year ago

I did a test run of this layer. Here are some notes on processing it:

I used the following config options, but we will probably want to create tiles at a higher resolution than z-level 10 (perhaps 13):

{
  "z_range": [0, 10],
  "statistics": [
    {
      "name": "coverage",
      "weight_by": "area",
      "property": "area_per_pixel_area",
      "aggregation_method": "sum",
      "resampling_method": "average",
      "val_range": [
        0,
        1
      ],
      "nodata_val": 0,
      "palette": "oryel",
      "nodata_color": "#ffffff00"
    }
  ]
}

Preview in Cesium:

[screenshot: Cesium preview, 2022-09-28]
robyngit commented 1 year ago

Based on feedback from Annett (below), we should do the following with the next run:

We should also remember to re-create the layer when updated data is available next year.

Details:

If displaying as a raster, I would suggest not using a resampling method that averages, since this is discrete information (yes/no classes). Otherwise you introduce new information about size (bright if the polygon is narrow, dark if it is larger), which could be misunderstood as thematic content. I would suggest 'nearest neighbour' instead of 'bilinear'. In that case you could also differentiate types (road, building, other).

Note that we are just about to complete an updated version. It covers a slightly larger area and has additional classes (three road types, plus airstrips and reservoirs as additional objects). But I assume that quality control will not be finished before the end of the year.
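As context for this advice, here is a minimal standalone sketch (not part of the PDG workflow; the file path and decimation factor are hypothetical) of why 'nearest' preserves class codes while an averaging method invents intermediate values. In the viz workflow this choice is made through the `resampling_method` config option.

```python
import rasterio
from rasterio.enums import Resampling

# hypothetical categorical raster of infrastructure class codes
with rasterio.open("infrastructure_classes.tif") as src:
    half = (src.height // 2, src.width // 2)

    # 'nearest' keeps the original discrete codes (e.g. 11, 12, 13, ...)
    classes = src.read(1, out_shape=half, resampling=Resampling.nearest)

    # 'average' blends neighbouring codes into values that are not valid
    # classes, which can read as false thematic content when mapped
    blended = src.read(1, out_shape=half, resampling=Resampling.average)
```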

julietcohen commented 6 months ago

Annett has produced a new version of this dataset, now located on Datateam at: /var/data/submission/pdg/bartsch_infrastructure/SACHI_v2/

The old version of the data is now in /var/data/submission/pdg/bartsch_infrastructure/old_version/

Initial notes:

julietcohen commented 6 months ago

Notes:

julietcohen commented 6 months ago

Data after converting to EPSG 4326:

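For reference, a minimal sketch of that conversion, assuming the local copy of the shapefile used in the cleaning script later in this thread:

```python
import geopandas as gpd

data = gpd.read_file("/home/jcohen/infrastructure/data/SACHI_v2.shp")
print(data.crs)                     # CRS the data ships with
data_4326 = data.to_crs(epsg=4326)  # reproject for the viz workflow / Cesium preview
```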
julietcohen commented 6 months ago

I was able to process a first draft of outputs with the visualization workflow. Importantly, in order to produce the web tiles I needed the same modification to to_image() in this pull request that I originally used to successfully produce web tiles for the permafrost and ground ice layer (see this comment in that issue). Both datasets gave the same error and failed to write any web tiles until I added a line in to_image() that converts the image data to float, which gives me more confidence that this PR should be merged. A similarity between these two datasets is that both visualize categorical variables: the permafrost dataset visualizes 4 categories of permafrost coverage, and this infrastructure dataset visualizes 7 types of infrastructure.
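For illustration only (the actual change is the one-line fix in the viz-raster pull request linked above), the kind of cast involved looks like this, with `pixel_values` standing in for the array handled inside `to_image()`:

```python
import numpy as np

def as_float_pixels(pixel_values):
    # casting the integer (categorical) pixel values to float is what allowed
    # web tiles to be written for both categorical layers
    return np.asarray(pixel_values).astype(float)
```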

julietcohen commented 6 months ago

On Zenodo, I was granted access to the older version of the dataset, which includes a README with attribute descriptions. I was hoping it would explain the infrastructure codes we see in this newer version of the dataset. Unfortunately, the attribute DN is not described in the older README.

First draft of the data layer at low resolution (max z-level set to 10):


To do:

 
script:

```python
# process the infrastructure data
# start at lower z-level initially,
# then find the real res of the data and
# apply the appropriate max z-level

# filepaths
from pathlib import Path
import os

# visual checks & vector data wrangling
import geopandas as gpd

# staging
import pdgstaging
from pdgstaging import TileStager

# rasterization & web-tiling
import pdgraster
from pdgraster import RasterTiler

# logging
from datetime import datetime
import logging
import logging.handlers
from pdgstaging import logging_config

# for transferring the log to workdir
import subprocess
from subprocess import Popen

# --------------------------------------------------------------

data_dir = '/home/jcohen/infrastructure/data/'

# define workflow configuration
config = {
    "dir_input": data_dir,
    "ext_input": ".shp",
    "ext_footprints": ".shp",
    "dir_staged": "staged/",
    "dir_geotiff": "geotiff/",
    "dir_web_tiles": "web_tiles/",
    "filename_staging_summary": "staging_summary.csv",
    "filename_rasterization_events": "raster_events.csv",
    "filename_rasters_summary": "raster_summary.csv",
    "filename_config": "config",
    "simplify_tolerance": 0.1,
    "tms_id": "WGS1984Quad",
    "z_range": [0, 10],
    "geometricError": 57,
    "z_coord": 0,
    "statistics": [
        {
            "name": "infrastructure_code",
            "weight_by": "area",
            "property": "DN",
            "aggregation_method": "max",  # TODO: need to think about this one more, maybe should be "sum" or something else
            "resampling_method": "nearest",
            "palette": [
                "#f48525",  # orange
                "#f4e625",  # yellow
                "#47f425",  # green
                "#25f4e2",  # turquoise
                "#2525f4",  # blue
                "#f425c3",  # pink
                "#f42525"   # red
            ],
            "nodata_val": 0,
            "nodata_color": "#ffffff00"
        },
    ],
    "deduplicate_at": None,
    "deduplicate_keep_rules": None,
    "deduplicate_method": None
}

# --------------------------------------------------------------

print("Staging initiated.")

# stage the tiles
stager = TileStager(config)
stager.stage_all()

# --------------------------------------------------------------

print("Rasterizing and Web-tiling initiated.")

# rasterize all staged tiles, resample to lower resolutions,
# and produce web tiles
RasterTiler(config).rasterize_all()

# transfer log from /tmp to user dir
# add subdirectories as needed
user = subprocess.check_output("whoami").strip().decode("ascii")
cmd = ['mv', '/tmp/log.log', f'/home/{user}/infrastructure/']
# initiate the process to run that command
process = Popen(cmd)

print("Script complete.")
```
julietcohen commented 6 months ago
clean_infrastructure.py:

```python
# Arctic infrastructure data from Annett B.
# clean by removing rows with NA geometries
# and split data into smaller files for processing
# in parallel

import geopandas as gpd
import numpy as np

data_path = "/home/jcohen/infrastructure/data/SACHI_v2.shp"
data = gpd.read_file(data_path)

# remove NA values from the geometry column
data_clean = data[data['geometry'].notna()]

# split the cleaned data into 6 subfiles
# each file will have an equal number of rows
split_gdfs = np.array_split(data_clean, 6)

for i, split_gdf in enumerate(split_gdfs):
    split_gdf.reset_index(inplace = True)
    # create the filename with a different number for each subfile
    filename = f'/home/jcohen/infrastructure/data_cleaned_split/SACHI_v2_clean_{i + 1}.gpkg'
    split_gdf.to_file(filename, driver = "GPKG", index = True)
    print(f'Saved {filename}')

print("Script complete.")
```
parsl script:

```python
# Processing the infrastructure data from Annett
# see issue: https://github.com/PermafrostDiscoveryGateway/pdg-portal/issues/27
# staging through web tiling with parsl
# conda environment: viz-local, with local installs
# for viz-staging and viz-raster, but with modified viz-raster
# because need to_image() fix (see PR),
# and pip install for parsl==2023.11.27

import pdgstaging
import pdgraster
from datetime import datetime
import json
import logging
import logging.handlers
from pdgstaging import logging_config
import os
import parsl
from parsl import python_app
from parsl.config import Config
from parsl.channels import LocalChannel
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider
import shutil
import subprocess
from subprocess import Popen

user = subprocess.check_output("whoami").strip().decode("ascii")

# start with a fresh directory!
print("Removing old directories and files...")
base_dir = "/home/jcohen/infrastructure/parsl_workflow/"
old_filepaths = [f"{base_dir}staging_summary.csv",
                 f"{base_dir}raster_summary.csv",
                 f"{base_dir}raster_events.csv",
                 f"{base_dir}config__updated.json",
                 f"{base_dir}log.log"]
for old_file in old_filepaths:
    if os.path.exists(old_file):
        os.remove(old_file)

# remove dirs from past run
old_dirs = [f"{base_dir}staged",
            f"{base_dir}geotiff",
            f"{base_dir}web_tiles",
            f"{base_dir}runinfo"]
for old_dir in old_dirs:
    if os.path.exists(old_dir) and os.path.isdir(old_dir):
        shutil.rmtree(old_dir)

activate_conda = 'source /home/jcohen/.bashrc; conda activate viz-local'
htex_local = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_local",
            worker_debug=False,
            #cores_per_worker=1,
            max_workers=6,
            provider=LocalProvider(
                #channel=LocalChannel(),
                init_blocks=1,
                max_blocks=1,
                worker_init=activate_conda
            ),
        )
    ],
)
parsl.clear()
parsl.load(htex_local)


def run_pdg_workflow(
    workflow_config,
    batch_size = 300
):
    """
    Run the main PDG workflow for the following steps:
    1. staging
    2. raster highest
    3. raster lower
    4. web tiling

    Parameters
    ----------
    workflow_config : dict
        Configuration for the PDG staging workflow, tailored to rasterization
        and web tiling steps only.
    batch_size: int
        How many staged files, geotiffs, or web tiles should be included in a
        single creation task? (each task is run in parallel) Default: 300
    """
    start_time = datetime.now()

    logging.info("Staging initiated.")

    stager = pdgstaging.TileStager(workflow_config)
    tile_manager = stager.tiles
    config_manager = stager.config

    input_paths = stager.tiles.get_filenames_from_dir('input')
    input_batches = make_batch(input_paths, batch_size)

    # Stage all the input files (each batch in parallel)
    app_futures = []
    for i, batch in enumerate(input_batches):
        app_future = stage(batch, workflow_config)
        app_futures.append(app_future)
        logging.info(f'Started job for batch {i} of {len(input_batches)}')

    # Don't continue to next step until all files have been staged
    [a.result() for a in app_futures]

    logging.info("Staging complete.")

    # ----------------------------------------------------------------
    # Create highest geotiffs
    rasterizer = pdgraster.RasterTiler(workflow_config)

    # Process staged files in batches
    logging.info(f'Collecting staged file paths to process...')
    staged_paths = tile_manager.get_filenames_from_dir('staged')
    logging.info(f'Found {len(staged_paths)} staged files to process.')
    staged_batches = make_batch(staged_paths, batch_size)
    logging.info(f'Processing staged files in {len(staged_batches)} batches.')

    app_futures = []
    for i, batch in enumerate(staged_batches):
        app_future = create_highest_geotiffs(batch, workflow_config)
        app_futures.append(app_future)
        logging.info(f'Started job for batch {i} of {len(staged_batches)}')

    # Don't move on to next step until all geotiffs have been created
    [a.result() for a in app_futures]

    logging.info("Rasterization highest complete. Rasterizing lower z-levels.")

    # ----------------------------------------------------------------
    # Rasterize composite geotiffs
    min_z = config_manager.get_min_z()
    max_z = config_manager.get_max_z()
    parent_zs = range(max_z - 1, min_z - 1, -1)

    # Can't start lower z-level until higher z-level is complete.
    for z in parent_zs:

        # Determine which tiles we need to make for the next z-level based on
        # the path names of the geotiffs just created
        logging.info(f'Collecting highest geotiff paths to process...')
        child_paths = tile_manager.get_filenames_from_dir('geotiff', z = z + 1)
        logging.info(f'Found {len(child_paths)} highest geotiffs to process.')
        # create empty set for the following loop
        parent_tiles = set()
        for child_path in child_paths:
            parent_tile = tile_manager.get_parent_tile(child_path)
            parent_tiles.add(parent_tile)
        # convert the set into a list
        parent_tiles = list(parent_tiles)

        # Break all parent tiles at level z into batches
        parent_tile_batches = make_batch(parent_tiles, batch_size)
        logging.info(f'Processing highest geotiffs in {len(parent_tile_batches)} batches.')

        # Make the next level of parent tiles
        app_futures = []
        for parent_tile_batch in parent_tile_batches:
            app_future = create_composite_geotiffs(
                parent_tile_batch, workflow_config)
            app_futures.append(app_future)

        # Don't start the next z-level, and don't move to web tiling, until the
        # current z-level is complete
        [a.result() for a in app_futures]

    logging.info("Composite rasterization complete. Creating web tiles.")

    # ----------------------------------------------------------------
    # Process web tiles in batches
    logging.info(f'Collecting file paths of geotiffs to process...')
    geotiff_paths = tile_manager.get_filenames_from_dir('geotiff')
    logging.info(f'Found {len(geotiff_paths)} geotiffs to process.')
    geotiff_batches = make_batch(geotiff_paths, batch_size)
    logging.info(f'Processing geotiffs in {len(geotiff_batches)} batches.')

    app_futures = []
    for i, batch in enumerate(geotiff_batches):
        app_future = create_web_tiles(batch, workflow_config)
        app_futures.append(app_future)
        logging.info(f'Started job for batch {i} of {len(geotiff_batches)}')

    # Don't record end time until all web tiles have been created
    [a.result() for a in app_futures]

    end_time = datetime.now()
    logging.info(f'⏰ Total time to create all z-level geotiffs and web tiles: '
                 f'{end_time - start_time}')

# ----------------------------------------------------------------
# Define the parsl functions used in the workflow:

@python_app
def stage(paths, config):
    """
    Stage a file
    """
    from datetime import datetime
    import json
    import logging
    import logging.handlers
    import os
    import pdgstaging
    from pdgstaging import logging_config

    stager = pdgstaging.TileStager(config = config, check_footprints = False)
    for path in paths:
        stager.stage(path)
    return True

# Create highest z-level geotiffs from staged files
@python_app
def create_highest_geotiffs(staged_paths, config):
    """
    Create a batch of geotiffs from staged files
    """
    from datetime import datetime
    import json
    import logging
    import logging.handlers
    import os
    import pdgraster
    from pdgraster import logging_config

    # rasterize the vectors, highest z-level only
    rasterizer = pdgraster.RasterTiler(config)
    return rasterizer.rasterize_vectors(
        staged_paths, make_parents = False)
    # no need to update ranges because manually set val_range in config

# ----------------------------------------------------------------
# Create composite geotiffs from highest z-level geotiffs
@python_app
def create_composite_geotiffs(tiles, config):
    """
    Create a batch of composite geotiffs from highest geotiffs
    """
    from datetime import datetime
    import json
    import logging
    import logging.handlers
    import os
    import pdgraster
    from pdgraster import logging_config

    rasterizer = pdgraster.RasterTiler(config)
    return rasterizer.parent_geotiffs_from_children(
        tiles, recursive = False)

# ----------------------------------------------------------------
# Create a batch of webtiles from geotiffs
@python_app
def create_web_tiles(geotiff_paths, config):
    """
    Create a batch of webtiles from geotiffs
    """
    from datetime import datetime
    import json
    import logging
    import logging.handlers
    import os
    import pdgraster
    from pdgraster import logging_config

    rasterizer = pdgraster.RasterTiler(config)
    return rasterizer.webtiles_from_geotiffs(
        geotiff_paths, update_ranges = False)
    # no need to update ranges because val_range is
    # defined in the config


def make_batch(items, batch_size):
    """
    Create batches of a given size from a list of items.
    """
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# ----------------------------------------------------------------
# run the workflow
workflow_config = '/home/jcohen/infrastructure/parsl_workflow/config.json'
print("Loaded config. Running workflow.")
run_pdg_workflow(workflow_config)

# Shutdown and clear the parsl executor
htex_local.executors[0].shutdown()
parsl.clear()

# transfer log from /tmp to user dir
cmd = ['mv', '/tmp/log.log', f'/home/{user}/infrastructure/parsl_workflow/']
# initiate the process to run that command
process = Popen(cmd)

print("Script complete.")
```
config:

```
{
  "dir_input": "/home/jcohen/infrastructure/data_cleaned_split",
  "ext_input": ".gpkg",
  "dir_staged": "staged/",
  "dir_geotiff": "geotiff/",
  "dir_web_tiles": "web_tiles/",
  "filename_staging_summary": "staging_summary.csv",
  "filename_rasterization_events": "raster_events.csv",
  "filename_rasters_summary": "raster_summary.csv",
  "filename_config": "config",
  "simplify_tolerance": 0.1,
  "tms_id": "WGS1984Quad",
  "z_range": [0, 12],
  "geometricError": 57,
  "z_coord": 0,
  "statistics": [
    {
      "name": "infrastructure_code",
      "weight_by": "area",
      "property": "DN",
      "aggregation_method": "max",
      "resampling_method": "nearest",
      "val_range": [11, 50],
      "palette": ["#f48525", "#f4e625", "#47f425", "#25f4e2", "#2525f4", "#f425c3", "#f42525"],
      "nodata_val": 0,
      "nodata_color": "#ffffff00"
    }
  ],
  "deduplicate_at": null,
  "deduplicate_keep_rules": null,
  "deduplicate_method": null,
  "clip_to_footprint": false
}
```
elongano commented 5 months ago

Category: Infrastructure

julietcohen commented 5 months ago

Annett has provided the README for this dataset. It provides a code for each infrastructure type and is located in /var/data/submission/pdg/bartsch_infrastructure/

julietcohen commented 4 months ago

The dataset package on the ADC: https://arcticdata.io/catalog/view/urn%3Auuid%3A4e1ea0af-6f7c-4a7a-b69f-4e818e113c43

The final dataset has been processed and is archived on datateam:

julietcohen commented 4 months ago

viz-workflow v0.9.2, which was used to produce this data layer, has been released. The only thing blocking this layer from moving to production is that the ADC data package needs to be finished and published. I have been adding more metadata to this package over the past week, so it is on its way to being assigned its pre-issued DOI!

julietcohen commented 4 months ago

Justin from the ADC has helpfully taken over the rest of the metadata documentation for this dataset, starting this week.

julietcohen commented 3 months ago

Annett has requested that I change the palette of this dataset so that buildings are in red and water is in blue. I also noticed that there were MultiPolygons (only 2% of the geometries) in the input data from Annett. I re-cleaned the input data (still removing NA geometries like last time, with the additional step of exploding the MultiPolygon geometries) and will re-process the data with the viz workflow, applying the requested palette change as well. This is not a time-consuming operation on Delta.
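A minimal sketch of that re-cleaning step (the output path is hypothetical; the actual script is the one uploaded to the data package):

```python
import geopandas as gpd

data = gpd.read_file("/home/jcohen/infrastructure/data/SACHI_v2.shp")

# drop rows with NA geometries, as in the original cleaning script
data = data[data["geometry"].notna()]

# new step: explode MultiPolygons into single Polygon geometries
data = data.explode(index_parts=False).reset_index(drop=True)

data.to_file("/home/jcohen/infrastructure/data_cleaned/SACHI_v2_clean.gpkg", driver="GPKG")
```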

This new cleaning script will replace the old one that is uploaded to the data package and the staged and geotiff tilesets will replace the ones currently in the pre-assigned DOI dir. I already discussed this with Justin.

For convenience, I am pasting the values for infrastructure code here (they are from the readme):

julietcohen commented 3 months ago

Annett also mentioned that she is going to update the dataset on Zenodo with the new version this week, and it already has a DOI there. Justin let her know we will still use our pre-assigned DOI since it is a new dataset, and I went into detail about our derived products. We will mention the Zenodo DOI in the dataset documentation on the ADC.

julietcohen commented 3 months ago

Delta decided that Python doesn't exist anymore in any of my environments, so I have been troubleshooting that for the past hour. The next step would be to uninstall and reinstall VS Code and remove the known hosts for Delta. I already tried uninstalling and reinstalling all Python extensions.

julietcohen commented 3 months ago

Annett has updated the dataset based on my feedback regarding the geometries. She uploaded the new version of SACHI_v2 to Zenodo. She included the following info:

DOI 10.5281/zenodo.10160636. I made a number of changes based on your feedback:
1) I had a closer look at the geometry properties. The version you got had duplicates, and the overlap areas of the different Sentinel-2 source granules were not yet merged. This is now solved, and there are also no features without attributes any more.
2) I extended the readme file for the metadata: S2_date1 to S2_date3 - dates of the individual Sentinel-2 images used for averaging; S1_winter - year(s) of the Sentinel-1 images used for averaging (December and/or January).

Justin and I made it clear that we would be giving the dataset a new DOI (the one I pre-created for this dataset) because our ADC version of the data package differs from the one she has on Zenodo, considering all our derived products.

Since she made changes to SACHI_v2, I will upload the new version to Datateam and reprocess the dataset with the viz workflow.

julietcohen commented 3 months ago

I reprocessed the infrastructure dataset with Annett's new version of the SACHI_v2 data as input to the viz workflow. Since she fixed the geometries, I only had to split the MultiPolygons before inputting the data into the viz workflow. I also updated the palette to represent buildings in red and water in blue as Annett requested, and removed yellow from the palette since we visualize the ice-wedge polygon layer in yellow and anticipate that users will explore these layers together. All input and output files and the new pre-viz cleaning script have been uploaded to /var/data/10.18739/A21J97929. The output web tiles have been uploaded to /var/data/tiles/10.18739/A21J97929/

I updated the demo portal with this layer.

julietcohen commented 3 months ago

I was able to refine the assignment of the categorical palette so that exactly one color is attributed to each infrastructure type. I did this in a similar way to the permafrost and ground ice dataset. Instead of using the attribute DN for the visualized tiles, I created a new attribute, palette_code, that maps the DN values to the numbers 1 through 7, because the values used for palette assignment seem to need to be evenly spaced in order for one color to correspond to one categorical value. The DN values are numerical but not evenly spaced.

# add attribute that codes the categorical DN attribute
# into evenly spaced numbers in order to assign the
# palette correctly to the categories in the web tiles
# (data is the GeoDataFrame of cleaned input geometries)
import numpy as np

conditions = [
    (data['DN'] == 11),
    (data['DN'] == 12),
    (data['DN'] == 13),
    (data['DN'] == 20),
    (data['DN'] == 30),
    (data['DN'] == 40),
    (data['DN'] == 50)
]
choices = [1, 2, 3, 4, 5, 6, 7]
data['palette_code'] = np.select(conditions, choices)

As a result, I see more diversity in the colors of the geometries, as Annett suggested they should look, and I integrated grey for the DN value 30 per her suggestion:


Taking this approach, I realize that in order for Annett's metadata for the DN values to match the numbers in the rasters, I need to include both attributes (DN and palette_code) as bands in the rasters. We will use the palette_code web tiles as the layer visualized on the portal, but users will want the option to use either band when they download the rasters for their own analysis.
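A sketch of how the statistics section of the config could carry both attributes, assuming (as I understand the workflow) that each entry in statistics becomes a band in the output GeoTIFFs; the palette hex values below are placeholders, not the final colors:

```python
statistics = [
    {   # band used for the web tiles: evenly spaced palette codes
        "name": "palette_code",
        "weight_by": "area",
        "property": "palette_code",
        "aggregation_method": "max",
        "resampling_method": "nearest",
        "val_range": [1, 7],
        "palette": ["#aaaaaa"] * 7,  # placeholder: one color per class
        "nodata_val": 0,
        "nodata_color": "#ffffff00"
    },
    {   # band kept so downloaded rasters match Annett's DN metadata
        "name": "infrastructure_code",
        "weight_by": "area",
        "property": "DN",
        "aggregation_method": "max",
        "resampling_method": "nearest",
        "val_range": [11, 50],
        "palette": ["#aaaaaa"] * 7,  # placeholder
        "nodata_val": 0,
        "nodata_color": "#ffffff00"
    }
]
```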

One note is that I have only tried this dataset with a custom palette in which each of the 7 hex codes is specifically assigned, rather than a pre-made named palette, because we wanted to include certain colors in a specific order and exclude colors that would too closely match other datasets on the PDG.

julietcohen commented 3 months ago

One more consideration for this dataset: the resolution of the data is not clear on the Zenodo page (this new Zenodo link is important because it points to version 2 of the dataset that Annett updated, not version 1, which was originally linked above when this ticket was first created). Sentinel data resolution can vary depending on the bands used. One way to find this may be to dive deeper into the methods, such as a paper associated with this dataset, or to ask Annett. I have been using z-level 12 as the max, because that is ~32 m resolution at latitude ±31.
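For reference, a back-of-the-envelope calculation that reproduces a figure close to the quoted ~32 m, assuming 256-pixel tiles and a single 360°-wide tile at z 0 (the exact per-pixel size depends on the TMS definition, so the max z-level should ultimately be checked against the source data resolution):

```python
import math

def approx_pixel_size_m(z, lat_deg, tile_px=256, z0_extent_deg=360.0):
    """Approximate ground size of one pixel at zoom level z and latitude lat_deg."""
    deg_per_px = z0_extent_deg / (2 ** z) / tile_px
    m_per_deg = 111_320  # ~meters per degree of longitude at the equator
    return deg_per_px * m_per_deg * math.cos(math.radians(lat_deg))

print(round(approx_pixel_size_m(12, 31), 1))  # ~32.8
```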

julietcohen commented 3 months ago

Annett has approved the layer to be moved to production. The remaining to-do items, in order:

mbjones commented 2 months ago

@julietcohen In reading through your correspondence on this ticket, I saw that it originated as a vector dataset. The most accurate way for us to present it would be as a vector layer rather than a raster. Let's please discuss this, as I suspect it would provide a much better result. We have a number of vector layers in the portals now, mainly via GeoJSON conversion. I'm not sure whether the dataset size would be prohibitive there, but let's discuss.
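For reference, a minimal sketch of the kind of GeoJSON conversion used for other vector layers (output path hypothetical; file size would need to be checked, since GeoJSON is verbose for large polygon sets):

```python
import geopandas as gpd

data = gpd.read_file("/home/jcohen/infrastructure/data/SACHI_v2.shp")

# GeoJSON layers on the portal are conventionally served in EPSG:4326
data.to_crs(epsg=4326).to_file("SACHI_v2.geojson", driver="GeoJSON")
```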