NewGraphEnvironment / dff-2022

Building digital field forms and processing data collected using R, postgresql, QGIS and other tools
Creative Commons Zero v1.0 Universal
0 stars 3 forks source link

streams/crossings extracts - column type incompatible #52

Closed smnorris closed 1 year ago

smnorris commented 1 year ago

Pulling from the posted .gdb is a nice workaround vs trying to page through the api. Unfortunately esri file geodatabases appear not to support a boolean type and ogr is translating the habitat columns to an integer. This is not compatible with existing map layer definitions in qgis. Maybe a -sql statement in the ogr2ogr command can fix?

$ ogrinfo bcfishpass.gdb streams -so
INFO: Open of `bcfishpass.gdb'
      using driver `OpenFileGDB' successful.

Layer name: streams
Geometry: Multi Line String
....<etc>
spawning_model_ch: Integer (0.0)
spawning_model_co: Integer (0.0)
spawning_model_sk: Integer (0.0)
spawning_model_st: Integer (0.0)
spawning_model_wct: Integer (0.0)
rearing_model_ch: Integer (0.0)

https://github.com/NewGraphEnvironment/dff-2022/blob/master/scripts/background_layers.sh#L61

smnorris commented 1 year ago

I'd post as geopackage but user feedback about accessing that format (especially for large layers like streams) has been poor.

smnorris commented 1 year ago

Interestingly, sqlite/geopackage boolean support is much the same as file geodatabase: https://www.sqlite.org/datatype3.html#:~:text=SQLite%20does%20not%20have%20a,literals%201%20and%200%20respectively

But ogr2ogr does a better job at encoding them than with the file geodatabase driver - this is from an existing "clipped_streams.gpkg"

spawning_model_ch: Integer(Boolean) (0.0)
spawning_model_co: Integer(Boolean) (0.0)
spawning_model_sk: Integer(Boolean) (0.0)
spawning_model_st: Integer(Boolean) (0.0)
spawning_model_wct: Integer(Boolean) (0.0)
rearing_model_ch: Integer(Boolean) (0.0)
rearing_model_co: Integer(Boolean) (0.0)
rearing_model_sk: Integer(Boolean) (0.0)
rearing_model_st: Integer(Boolean) (0.0)
rearing_model_wct: Integer(Boolean) (0.0)
smnorris commented 1 year ago

Using the -sql option and wrapping the habitat columns like cast(spawning_model_ch as boolean) as spawning_model_ch, works fine.

But:

I'll host flatgeobuf versions of the data. It is more redundant data but the format is really nice for quick downloads. Also handy as a fast input to tippecanoe if we move to storing the streams as vector tiles in the project.

smnorris commented 1 year ago

Test of extracting VICT from the two file formats:

I think the redundant data is ok

smnorris commented 1 year ago

fixed with 065b288a16ddfb684092d206ae4e54df473e3a18. Extracts are now pretty quick. I wonder how pgfs compares to requests from a flatgeobuf.