Closed heidimok closed 2 months ago
Hi @RicardoGomesIST, this was one of the open questions for the municipal data. We were wondering where the `codigo` value on the spreadsheet comes from, as we don’t see that field in the spatial data (municipalities_gadm41_PRT_2.zip). Would you be able to comment on it? Feel free to ask any questions here as a comment as well, and the team can clarify.
cc @alukach - please also comment if this open question is no longer relevant as I know things can change fast as we continue to make progress.
Hi Heidi, I believe the code comes from the Portuguese National Statistics. Why? Do you wish to use it for the data assignment? In fact, there are some municipalities in Portugal with the same name (on the mainland and on the islands). I will ask Mariana to join us as she developed this dataset. Her GitHub username is marianajanuario97 - can you please add her?
Thanks @RicardoGomesIST!
@alukach given the info, would you be able to provide more guidance here on what the ask is in terms of adjusting the data in any way?
Also, I can add Mariana!
Hi. I corrected the municipalities' shapefile by adding data to the "CC_2" column. I also shared the "Municipal Data v6" Google Sheets file with @yellowcap, with the "Código" column corrected: it held data in number format and is now converted to text, so the code "101" is now "0101", for instance. The columns "Código" and "CC_2" can now be joined. Hope it helps!
Here is the shapefile updated:
Made a copy of this into our drive. https://docs.google.com/spreadsheets/d/1Q-qWsO0N3bxbAQJu27_Zny79bGINhMvLeZY0lQdkNvQ/edit
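For anyone reproducing the "Código" fix mentioned above outside of Google Sheets, here is a minimal sketch of the same idea in Python. It assumes codes are four characters wide (inferred only from the "101" to "0101" example), and the rows, the `value` field and the `normalize_code` helper are hypothetical placeholders, not the real sheet or shapefile contents.

```python
def normalize_code(value, width=4):
    """Return a municipality code as zero-padded text, e.g. 101 -> "0101".

    The width of 4 is an assumption based on the "0101" example.
    """
    text = str(value).strip()
    # Spreadsheet exports sometimes render integers as floats ("101.0").
    if text.endswith(".0"):
        text = text[:-2]
    return text.zfill(width)


# Hypothetical rows standing in for the metrics sheet and shapefile attributes.
metrics = [
    {"Código": 101, "value": 1.0},
    {"Código": 3101, "value": 2.0},
]
geometries = [
    {"CC_2": "0101", "NAME_2": "Example A"},
    {"CC_2": "3101", "NAME_2": "Calheta"},
]

# Index geometries by their text code, then look up each metric row.
geometry_by_code = {g["CC_2"]: g for g in geometries}
joined = [
    (m, geometry_by_code[normalize_code(m["Código"])])
    for m in metrics
    if normalize_code(m["Código"]) in geometry_by_code
]
```

Without the padding step, the numeric `101` would never match the text `"0101"` and that row would silently drop out of the join.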
Note: This is a Slack discussion (from @alukach) I'm bringing into GitHub so we can track ongoing municipal data questions and updates as comments here for better communication.
Context
We received the latest municipal data (v5) from Ricardo. Municipal Data v5 - Google Sheets
Some outstanding issues with the data:
The following rows in the `metrics` sheet are missing corresponding geometries in the municipal spatial data: 3101, 3103, 3102, 4801, 4401, 4601, 4502, 4301, 4901, 3110, 3104, 4501, 3108, 4603, 3106, 3107, 3201, 4701, 4302, 4802, 3109, 4602, 3105
Additionally, the Study sheet has a few issues. Some field names needed to be updated (some of these changes to requirements are new, not Tecnico's fault in any way) and the fields that are being used to join Geometries to Metrics are incorrect. Anthony was able to get the ingestion working by making the following changes to the study sheet:
A corrected sheet can be found here: https://docs.google.com/spreadsheets/d/1Zwoohp8Zq8D4LgLY1aIrB5LTLRvxmV9dr8FAghphCV0/edit#gid=1485987722
Data Issues
Here is the output from the seed operation (i.e. ingesting the geometries & metrics). Joining on the `NAME_2` field of the geometry is somewhat problematic in that there are duplicate geometries that share that field in the geospatial data. If a duplicate is found in the geometry, we log + ignore and continue. If a geometry is found without a corresponding metric, we log + ignore and continue.

Reading the above logs, we have a geometry with `NAME_2=Praia da Vitória` that has no metric. We also have a metric with `nombre=Vila da Praia da Vitória` with no geometry. It seems reasonable that the metric row should be updated to trim `Vila da` from the name. As per https://github.com/developmentseed/tecnico-energy-app/issues/26, we drop any metrics that don’t have geometries.

The core issue is that we’re joining the geometry data with the metrics data on a non-ideal field.
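To make the "log + ignore and continue" behaviour concrete, here is a rough sketch in Python. It is not the actual seed code: the `join_on_key` helper, the `island` field and the sample rows are hypothetical, and it assumes that "ignore" means keeping the first geometry for a key and skipping later duplicates.

```python
import logging

logger = logging.getLogger("seed")


def join_on_key(geometries, metrics, geom_key, metric_key):
    """Pair each geometry with its metric row, logging and skipping problems.

    Assumption: on duplicate geometry keys, the first geometry wins.
    Metrics that never match a geometry are simply dropped.
    """
    metric_by_key = {m[metric_key]: m for m in metrics}
    joined = {}
    for geom in geometries:
        key = geom[geom_key]
        if key in joined:
            logger.warning("Duplicate geometry %s=%r, ignoring", geom_key, key)
            continue
        metric = metric_by_key.get(key)
        if metric is None:
            logger.warning("No metric for geometry %s=%r, ignoring", geom_key, key)
            continue
        joined[key] = (geom, metric)
    return joined


# Hypothetical rows mirroring the cases described in this issue.
geometries = [
    {"NAME_2": "Calheta", "island": "São Jorge"},
    {"NAME_2": "Calheta", "island": "Madeira"},
    {"NAME_2": "Praia da Vitória", "island": "Terceira"},
]
metrics = [
    {"nombre": "Calheta"},
    {"nombre": "Vila da Praia da Vitória"},
]

joined = join_on_key(geometries, metrics, "NAME_2", "nombre")
```

With these rows, only one `Calheta` geometry survives and `Praia da Vitória` is skipped because the metric name does not match exactly. Joining on a unique code field instead of the name would avoid both problems.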
Questions
Understand from Ricardo where the `codigo` value on the spreadsheet comes from, as we don’t see that field in the spatial data (municipalities_gadm41_PRT_2.zip).

We have two separate sets of geometries with duplicate `NAME_2` properties: `Lagoa` and `Calheta`. For `Lagoa`, there appears to be a town in the Azores, and then a city on the south coast. For `Calheta`, there appears to be a region on the southeastern half of Sao Jorge island and another on the southern half of the western side of Madeira island. So these are definitely separate communities.

See the screenshot of one of the `Calheta` geometry properties. The metric data shows `codigo=3101` for Calheta but we don’t have any property like that.

![image (2)](https://github.com/developmentseed/tecnico-energy-app/assets/10764915/5de2866f-8ecb-48aa-aa34-6ed590123f57)

Other Notes
@alukach highlighted data that we added in the spreadsheet in yellow and any columns/keys that we needed to rename in fuchsia
![image (4)](https://github.com/developmentseed/tecnico-energy-app/assets/10764915/241bf65e-d5db-479c-8272-dc0127d641bb)