AtlasOfLivingAustralia / layers-service

Spatial layers - this repo for issues/doc only, not code
Load layer - PSMA Administrative Boundaries #81

Closed: ansell closed this issue 5 years ago

ansell commented 8 years ago

The government has published an open dataset of administrative boundaries for Australia on data.gov.au that may be useful to ALA:

https://data.gov.au/dataset/psma-administrative-boundaries

The layers for the November 2018 PSMA Administrative Boundaries include:

2016Census_I01A_AUST_IARE

2016 ABS Australian Census Selected Person Characteristics by Indigenous Status by Sex

Date: August 2016

Main field: IARE_16COD 7055 geometries, 412 unique values

2016Census_I01B_AUST_IARE

2016 ABS Australian Census Indigenous Status by Sex

Date: August 2016

Main field: IARE_16COD 7055 geometries, 412 unique values

COMPLETE_COMM_ELECTORAL

Commonwealth Electoral Boundaries captures the boundaries for Commonwealth Electorates.

Date: August 2018

Main field: NAME 7298 geometries, 151 unique values

COMPLETE_GCCSA_2016

2016 ABS Greater Capital City Statistical Areas

Date: November 2017

Main field: GCC_16NAME 6645 geometries, 16 unique values

COMPLETE_IARE_2016

2016 ABS Indigenous Areas

Date: February 2017

Main field: IARE_16NAM 7055 geometries, 412 unique values

COMPLETE_ILOC_2016

2016 ABS Indigenous Locations

Date: February 2017

Main field: ILOC_16NAM 7733 geometries, 1097 unique values

COMPLETE_IREG_2016

2016 ABS Indigenous Regions

Date: February 2017

Main field: IREG_16NAM 6663 geometries, 40 unique values

COMPLETE_LGA

Local Government Areas

Date: November 2018

Main field: LGA_NAME 6005 geometries, 560 unique values

COMPLETE_LOCALITY

Localities

Date: November 2018

Main field: NAME 16321 geometries, 14524 unique values

COMPLETE_MB_2016

2016 ABS Mesh Blocks

Date: November 2017

Main field: MB_16CODE 358011 geometries, 359009 unique values

COMPLETE_REMOTENESS_2016

2016 Remoteness Areas

Date: May 2018

Main field: REM16_CODE 6773 geometries, 35 unique values

COMPLETE_SA1_2016

2016 ABS SA1

Date: November 2017

Main field: SA1_16MAIN 64043 geometries, 57490 unique values

COMPLETE_SA2_2016

2016 ABS SA2

Date: November 2017

Main field: SA2_16NAME 8921 geometries, 2292 unique values

COMPLETE_SA3_2016

2016 ABS SA3

Date: November 2017

Main field: SA3_16NAME 6971 geometries, 340 unique values

COMPLETE_SA4_2016

2016 ABS SA4

Date: November 2017

Main field: SA4_16NAME 6721 geometries, 89 unique values

COMPLETE_SEIFA_2016

Date: May 2018

Main field: SIEFA16PID 59716 geometries, 59716 unique values

COMPLETE_SOS_2016

Date: May 2018

Main field: SOS_16NAME 8419 geometries, 4 unique values (BOUNDED LOCALITY, MAJOR URBAN, OTHER URBAN, RURAL BALANCE)

COMPLETE_SOSR_2016

Date: May 2018

Main field: SSR_16CODE 8443 geometries, 71 unique values

COMPLETE_STATE

Date: May 2016

Main field: STATE_NAME 12680 geometries, 9 unique values

COMPLETE_STATE_ELECTORAL

Date: February 2018

Main field: NAME 3291 geometries, 481 unique values

COMPLETE_SUA_2016

Date: May 2018

Main field: SUA_16NAME 6750 geometries, 110 unique values

COMPLETE_UCL_2016

Date: May 2018

Main field: UCL_16NAME 8479 geometries, 1835 unique values

COMPLETE_WARD

Date: November 2018

Main field: NAME 1481 geometries, 477 unique values
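For anyone wanting to re-check the figures above, here is a minimal sketch of how the two counts per layer could be computed. It assumes that "geometries" means the number of records in the shapefile and "unique values" means the number of distinct values in the main field; that reading is an assumption, not something confirmed in this issue. The `summarise_field` helper and the sample records are hypothetical.

```python
def summarise_field(records, field):
    """Return (record_count, distinct_value_count) for one attribute field.

    `records` is any iterable of dicts, one per shapefile record
    (hypothetical input; how the records are read is left open).
    """
    values = [record[field] for record in records]
    return len(values), len(set(values))


# Hypothetical records standing in for rows of a layer such as COMPLETE_WARD.
sample_records = [
    {"NAME": "North Ward"},
    {"NAME": "North Ward"},  # a second polygon belonging to the same ward
    {"NAME": "South Ward"},
]

geometry_count, unique_count = summarise_field(sample_records, "NAME")
print(f"Main field: NAME {geometry_count} geometries, {unique_count} unique values")
```

Under this reading the unique count can never exceed the geometry count, so an entry such as COMPLETE_MB_2016 may be worth re-checking.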

Tasilee commented 8 years ago

These do look authoritative and comprehensive. How do they compare with what we currently have loaded?

ansell commented 8 years ago

Some of the layers included there are still being loaded as part of the IEK sprint, but they were sourced separately from the various agencies/departments that publish them first. Others were loaded separately in the past, and I am not sure what the process for updating those layers is.

As far as I can tell there are no additions or changes in this dataset. They just copy the most recent version of each primary source dataset and include it in this one, so there is no added value other than being able to source all of the most recent versions regularly from one place. That may itself be valuable in terms of time if we can automate the updates for this dataset.

ansell commented 8 years ago

Just for reference, the shapefiles available in this dataset include the following:

Tasilee commented 8 years ago

Thanks Peter. I agree. As a one-stop shop, it makes sense to source the layers here and try to automate the process as best we can.

ansell commented 8 years ago

Started work on the electoral and political boundaries layers first due to a request from MERIT. Layers have been uploaded and fields are being created gradually.

ansell commented 8 years ago

Commonwealth Electoral Boundaries, ABS SA1 2011 and ABS SA2 2011 are all triggering some kind of exception in spatial-service that results in no columns from the shapefile showing in the dropdown list when attempting to create fields based on them. The uploads are succeeding, but there are many silent exception cases that could be occurring for these files. Some of the issues being tracked on spatial-service for fixing in the future specifically related to these three layers are:

https://github.com/AtlasOfLivingAustralia/spatial-service/issues/18

https://github.com/AtlasOfLivingAustralia/spatial-service/issues/25

ansell commented 8 years ago

The geometries for State Electoral Regions in the May 2016 release are not unique to electoral regions. In a few cases there are multiple "Electoral Regions" encompassing the same part of WA. The only reason this is visible to me programmatically is that I was expecting the "geom" field to be unique when joined to SE_PID or SE_PLY_PID. Instead, identical geom field MULTIPOLYGON values are assigned to different SE_PIDs via different SE_PLY_PIDs, which indicates that there are two different state electoral regions each encompassing the same physical boundaries. That seems to be incorrect given my understanding of electoral regions.
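A minimal sketch of the kind of check that surfaces this, assuming the joined rows are available as dicts with "SE_PID", "SE_PLY_PID" and a "geom" value in WKT text form. The `find_shared_geometries` helper and the sample rows are hypothetical, and comparing WKT strings only catches byte-identical geometries, not ones that are merely spatially equal.

```python
from collections import defaultdict


def find_shared_geometries(rows):
    """Map each geometry to the SE_PIDs that use it, keeping only
    geometries that are assigned to more than one SE_PID."""
    pids_by_geom = defaultdict(set)
    for row in rows:
        pids_by_geom[row["geom"]].add(row["SE_PID"])
    return {
        geom: sorted(pids)
        for geom, pids in pids_by_geom.items()
        if len(pids) > 1
    }


# Hypothetical rows: two electoral regions sharing one identical geometry.
rows = [
    {"SE_PID": "WA1", "SE_PLY_PID": "P1",
     "geom": "MULTIPOLYGON(((0 0,1 0,1 1,0 0)))"},
    {"SE_PID": "WA2", "SE_PLY_PID": "P2",
     "geom": "MULTIPOLYGON(((0 0,1 0,1 1,0 0)))"},
    {"SE_PID": "WA3", "SE_PLY_PID": "P3",
     "geom": "MULTIPOLYGON(((2 2,3 2,3 3,2 2)))"},
]

for geom, pids in find_shared_geometries(rows).items():
    print(f"{len(pids)} SE_PIDs share one geometry: {', '.join(pids)}")
```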

ansell commented 8 years ago

In the UCL_2011 dataset, UCL_11CODE, SSR_11CODE, and SOS_11CODE do not change when the equivalent names, UCL_11NAME, SSR_11NAME, and SOS_11NAME, change. Hence, the codes should not be relied on for programmatic identification; only the PIDs, which do seem to change to match the names, should be used (1821 unique values for both UCL_11PID and UCL_11NAME).
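A rough sketch of how a candidate identifier field can be tested against the names, assuming the attribute rows are available as dicts. The `ambiguous_keys` helper and the sample values are hypothetical; only the field names come from the dataset.

```python
from collections import defaultdict


def ambiguous_keys(rows, key_field, name_field):
    """Return the key values that map to more than one distinct name.

    An empty result means key_field identifies name_field unambiguously
    within the given rows.
    """
    names_by_key = defaultdict(set)
    for row in rows:
        names_by_key[row[key_field]].add(row[name_field])
    return {
        key: sorted(names)
        for key, names in names_by_key.items()
        if len(names) > 1
    }


# Hypothetical rows: the code repeats across names, the PID does not.
ucl_rows = [
    {"UCL_11PID": "PID1", "UCL_11CODE": "100", "UCL_11NAME": "Town A"},
    {"UCL_11PID": "PID2", "UCL_11CODE": "100", "UCL_11NAME": "Town B"},
    {"UCL_11PID": "PID3", "UCL_11CODE": "200", "UCL_11NAME": "Town C"},
]

print(ambiguous_keys(ucl_rows, "UCL_11CODE", "UCL_11NAME"))
print(ambiguous_keys(ucl_rows, "UCL_11PID", "UCL_11NAME"))
```

The same call can be repeated for the SSR and SOS code/name pairs.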

ansell commented 8 years ago

Hi @adam-collins, the shapefiles distributed by PSMA Australia Limited that are part of this issue (except for Town Points, which are not polygon based) have now been loaded into spatial-test with the following field numbers.

Actually copying these to production would likely push Cassandra/Solr/Biocache against their memory limits with respect to record size, so the data team (cc @M-Nicholls) does not want them pushed to production before the Cassandra 3 + sharding and Solr Cloud 5/6 changes being worked on by @djtfmartin are in place. That won't be happening until at least after the BIE updates, but I am letting you know which fields have been added so you can review them during the spatial sprint.

Producing these shapefiles for future PSMA quarterly releases will involve some small changes to field names, because years such as "2011" are encoded in field and file names as "11". Otherwise the process should be fairly simple in terms of interaction.
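As an illustration of the renaming involved, here is a small sketch that derives a field name from a template and a four-digit year. The `field_name` helper and the `{yy}` template convention are hypothetical and not part of any existing loading script. A per-field template is used rather than a blanket search-and-replace because the year does not sit in the same position in every field name.

```python
def field_name(template, year):
    """Fill a field-name template with the two-digit form of a year,
    e.g. 2011 -> "11"."""
    two_digit_year = f"{year % 100:02d}"
    return template.format(yy=two_digit_year)


# Templates for fields mentioned in this issue.
print(field_name("UCL_{yy}NAME", 2011))  # UCL_11NAME
print(field_name("UCL_{yy}NAME", 2016))  # UCL_16NAME
print(field_name("REM{yy}_CODE", 2016))  # REM16_CODE
```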

ansell commented 7 years ago

Hi @adam-collins, could you push across the Indigenous-related layers from this work: 10881, 10882, 10883, 10884, 10885, 10886? They have all had their "include in biocache index" settings switched off, and they are not currently set to replace existing layers, so they will not affect biocache index record sizes. They are urgently required by @StephanievG for a conference next week. Refs https://github.com/AtlasOfLivingAustralia/seasonal-calendar/issues/39

ansell commented 6 years ago

I am still presuming that the previous layer uploads for this cannot be rescued after the database issue.

A future version of this will focus initially on whatever the latest layers are from PSMA/ABS at the time this issue is worked on. This is likely to include recent electoral boundary redistributions, the 2016 census, and the latest ABS statistics.

ansell commented 5 years ago

Freshdesk issue 29289 requires LGA to be updated to a version less than 2 years old to pick up a new council name. Adding the May 2016 layer may not do that, so it will be best to load from the latest PSMA dataset when scheduling for this issue comes up.