OpenWaterFoundation / cdss-app-snodas-tools

Colorado's Decision Support Systems (CDSS) Snow Data Assimilation System (SNODAS) Tools

Geojson file has truncated (shapefile) properties #4

Open smalers opened 7 years ago

smalers commented 7 years ago

The intent of publishing the GeoJSON file is to have attributes that agree with the CSV file, per the memo in which we settled on CSV column names. GeoJSON has no limit on property name length. Why do the daily GeoJSON files have the truncated shapefile names?

The daily GeoJSON files should have the long attribute names if that is possible.

egiles16 commented 7 years ago

This has been fixed in the Python script. To solve this problem I had to code the following:

  1. Export the daily GeoJSON file from the shapefile vector object using QGIS (the GeoJSON file has the short shapefile fieldnames)

  2. Import the daily GeoJSON file as a QGIS vector object.

  3. Use QGIS to manipulate the field names of the GeoJSON vector object (to display the long CSV fieldnames).

  4. Export a new daily GeoJSON file from the edited GeoJSON vector object (now with correct long fieldnames) using QGIS.

  5. Delete the original GeoJSON file with the short fieldnames.

We will have to rerun all of the historical data to produce daily GeoJSON files that have the correct (long and descriptive) fieldnames.
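For files that already exist, the rename in steps 2 to 4 could also be approximated without QGIS, since GeoJSON is plain JSON. The following is a minimal sketch using only the Python standard library, not the QGIS-based code in the SNODAS Tools script. The `FIELD_RENAMES` mapping and both function names are hypothetical; the only entry taken from this thread is `SWEMean_mm`, and the other short-to-long pairs would need to be filled in from the CSV column name memo.

```python
import json

# Hypothetical mapping from truncated shapefile field names to long CSV names.
# Only the SWEMean_mm entry comes from this issue; add the others as needed.
FIELD_RENAMES = {
    "SWEMean_mm": "SNODAS_SWE_Mean_mm",
}


def rename_properties(geojson, renames):
    """Return a copy of a FeatureCollection with property keys renamed.

    Keys that are not listed in `renames` are kept unchanged.
    """
    renamed_features = []
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}
        new_props = {renames.get(key, key): value for key, value in props.items()}
        renamed_features.append({**feature, "properties": new_props})
    return {**geojson, "features": renamed_features}


def rename_file(in_path, out_path, renames=FIELD_RENAMES):
    """Read a GeoJSON file, rename its properties, and write the result."""
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(rename_properties(data, renames), f)
```

Writing to a separate `out_path` keeps the original file in place until the new one has been checked, which mirrors the export-then-delete order of steps 4 and 5.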

smalers commented 7 years ago

I looked at a GeoJSON file created for 20170401 and the attributes are not totally consistent with what we decided on when reviewing CSV column names.

EDIT_FLAG - probably not needed since used for data QC

AREA_SQKM - is this needed? Maybe OK to keep if from the original layer as long as documented.

SWEMean_mm - should be SNODAS_SWE_Mean_mm, right? Seems like an oversight.

The other CSV statistics seem to be in place. The main issues are:

1) I loaded 80 GB for the full run to Amazon S3 and would rather not have to do that again since it took almost 2 days and we get billed for bandwidth. Perhaps make the changes to address the SWEMean_mm name and then move forward with what we have. If we get the issue sorted out with OIT we can rerun and load onto their server.

2) Kory may need to update the web application to accept both names, so that old and new files can be handled. If he does that we know it will work with regenerated files, should we decide to rerun the full history.
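One way a consumer could accept both names is to look up the new name first and fall back to the old one. This is only a sketch in Python to illustrate the idea; the web application's actual code and language are not shown in this thread, and `get_swe_mean_mm` is a hypothetical helper name.

```python
# Field names in order of preference: new long name first, old short name second.
SWE_MEAN_FIELD_NAMES = ("SNODAS_SWE_Mean_mm", "SWEMean_mm")


def get_swe_mean_mm(properties):
    """Return the mean SWE value from a feature's properties, or None if absent.

    Works with both regenerated files (long name) and older files (short name).
    """
    for name in SWE_MEAN_FIELD_NAMES:
        if name in properties:
            return properties[name]
    return None
```

Checking for key presence rather than truthiness means a legitimate value of `0` is still returned instead of being skipped.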

egiles16 commented 7 years ago

EDIT_FLAG is important. The user documentation explains in detail where the Colorado input watershed basin shapefile came from and notes that some basins were edited from their original source. The EDIT_FLAG attribute tells the end user which basins differ from their NWS source.

AREA_SQKM is useful for comparing the actual total area of the basin with the area calculated by the SNODAS Tools (land area without inclusion of large water bodies). This attribute is explained in the user documentation.

Changed the GeoJSON output attribute name from SWEMean_mm to SNODAS_SWE_Mean_mm. This is correct starting from 4/4/17. I did not rerun the entire historical process to correct the name of this one GeoJSON attribute. Therefore, the output GeoJSON files from 09/30/03 to 04/03/17 on Amazon Web Services have the incorrect field name of SWEMean_mm.
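The date cutoff above can be written down as a small lookup, which may help anyone scripting against the files currently on Amazon Web Services. This is a sketch that assumes the dates are in US month/day/year order (so the first corrected file is April 4, 2017) and that the historical files have not been regenerated; `swe_mean_field_for` is a hypothetical helper name.

```python
from datetime import date

# First daily file written with the corrected field name (4/4/17, assumed US order).
FIRST_DATE_WITH_LONG_NAME = date(2017, 4, 4)


def swe_mean_field_for(file_date):
    """Return the mean SWE field name expected in the daily GeoJSON for a date."""
    if file_date >= FIRST_DATE_WITH_LONG_NAME:
        return "SNODAS_SWE_Mean_mm"
    return "SWEMean_mm"
```

For example, `swe_mean_field_for(date(2017, 4, 1))` gives the old short name, which matches the 20170401 file discussed earlier in this thread.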