gopeg / feedback

Feedback collected in GO-PEG use case data harmonisation
Creative Commons Zero v1.0 Universal

POPIMPACT-Iterative processing of data sets #17

Open MayteTDGeograma opened 2 years ago

MayteTDGeograma commented 2 years ago
name: Iterative data processing
about: Description of the iterative processes in the use case.
title: Diversity of input data
labels: -
assignees: -

Iterative data processing: Project in which this issue occurred

POPIMPACT

Tool used

Hale studio 4.1.0, win64, 32 GB of RAM
PostgreSQL 14+ PostGIS 3.1, win64, 32 GB of RAM

Description

DIVERSITY OF INPUT DATA
The source data in POPIMPACT is very diverse, which causes several problems:
• Segmented data: When data sets are highly segmented, they first have to be merged so that they can be processed as a whole.
• Diversity of formats: Each dataset can be in a different format.

DIFFERENT EPSGs
Each dataset can have a different EPSG code, and even the same dataset can use different EPSG codes in different areas.

Type of issue

Repetitive: Large amounts of repetition in the process

Workflow

DIVERSITY OF INPUT DATA

With Hale studio (4.1.0 or higher) we can select multiple files at once, either during the schema import or when importing source data.

image

Supported formats are:

image

Importing the source files this way guarantees that the data is harmonized in Hale before being loaded into the database.

DIFFERENT EPSGs
In the normalization process (inside a PostGIS function) the EPSG of the data is checked; if it differs from the project's EPSG, the geometry is reprojected:

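-- On entry, GeometryBuilding holds the name of the geometry column of edif.
IF EPSGBuilding != EPSGProyect THEN
      -- Build the point expression first, while GeometryBuilding is still the column name.
      GeometryBuildingPU := 'ST_Transform(ST_PointOnSurface((ST_Dump(edif."'||GeometryBuilding||'")).geom), '||EPSGProyect||')';
      GeometryBuilding := 'ST_Transform((ST_Dump(edif."'||GeometryBuilding||'")).geom, '||EPSGProyect||')';
ELSE
      -- Same EPSG as the project: no reprojection needed.
      GeometryBuilding := '(ST_Dump(edif."'||GeometryBuilding||'")).geom';
      -- GeometryBuilding now holds the dumped-geometry expression, which is reused here.
      GeometryBuildingPU := 'ST_PointOnSurface('||GeometryBuilding||')';
END IF;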
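
The branching above builds two SQL expression strings per geometry column: one for the building geometry and one for a point on its surface, each wrapped in ST_Transform only when the source EPSG differs from the project EPSG. As a rough illustration of that logic outside the database, here is a minimal Python sketch. The function name build_geometry_expressions, the column name "geom" and the EPSG codes are hypothetical and not taken from the POPIMPACT code; the table alias edif is copied from the snippet.

```python
def build_geometry_expressions(column: str, source_epsg: int, project_epsg: int) -> tuple[str, str]:
    """Return (geometry_expr, point_expr) SQL fragments for one geometry column.

    Assumes `column` is a trusted identifier; it is interpolated as-is.
    """
    # Single-part geometries extracted from the (possibly multi-part) column.
    dumped = f'(ST_Dump(edif."{column}")).geom'
    # A point guaranteed to lie on the surface of each single-part geometry.
    point = f"ST_PointOnSurface({dumped})"
    if source_epsg != project_epsg:
        # Reproject both expressions only when the source EPSG differs.
        return (
            f"ST_Transform({dumped}, {project_epsg})",
            f"ST_Transform({point}, {project_epsg})",
        )
    return dumped, point


# Hypothetical column name and EPSG codes, for illustration only.
same = build_geometry_expressions("geom", 25830, 25830)
different = build_geometry_expressions("geom", 4326, 25830)
print(different[0])
print(different[1])
```

Deriving both results from the untouched column name avoids any dependence on the order in which the two expressions are built.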

Impact

This new Hale functionality solves a problem we previously had when loading large datasets that share the same model into POPIMPACT.

Related issue(s)

POPIMPACT - Data harmonisation #16

thorsten-reitz commented 1 year ago

Thanks for the feedback.

On the spatial reference info: If you use the multi-file import, the same settings are applied to all files imported at the same time. If you need to import data with different CRS definitions, you would have to import them in batches, one batch per CRS.

Alternatively, you can automate this process via hale-cli.
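
Importing in batches requires knowing which files share a CRS. A minimal Python sketch of that grouping step follows, assuming the EPSG code of each file is already known. The function name batch_by_epsg, the file paths and the EPSG codes are hypothetical; how each batch is then imported (multi-file import in the GUI or hale-cli) is left out.

```python
from collections import defaultdict


def batch_by_epsg(file_epsg: dict[str, int]) -> dict[int, list[str]]:
    """Group file paths by EPSG code so each group can be imported together."""
    batches: dict[int, list[str]] = defaultdict(list)
    # Sort by path so the order inside each batch is deterministic.
    for path, epsg in sorted(file_epsg.items()):
        batches[epsg].append(path)
    return dict(batches)


# Hypothetical inventory of source files and their EPSG codes.
files = {
    "north/buildings.shp": 25830,
    "south/buildings.shp": 25829,
    "north/parcels.gml": 25830,
}
batches = batch_by_epsg(files)
for epsg, paths in sorted(batches.items()):
    print(epsg, paths)
```

Each resulting list is one import batch to which a single CRS setting applies.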