mintproject / data-catalog

0 stars 0 forks source link

Update polygon point coverage fails #11

Open mosoriob opened 1 year ago

mosoriob commented 1 year ago

The dataset has one resource:

postgres=# select * from resources where dataset_id='5ca18d80-dd88-4f95-9e1c-665289756748';
-[ RECORD 1 ]-+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
data_url      | https://data.mint.isi.edu/files/cycles-input-data/africa/soil_weather/cycles-6.87500-37.87500.soil_weather.zip
resource_type | Zip
json_metadata | {"spatial_coverage": {"type": "Point", "value": {"x": 37.0, "y": 6.0}}, "temporal_coverage": {"end_time": "2021-03-30T00:00:00", "start_time": "2000-01-01T00:00:00"}}
layout        | {}
created_at    | 2023-06-09 15:33:12.448088
updated_at    | 2023-06-09 15:33:12.448088
name          | cycles-6.87500-37.87500.soil_weather.zip
id            | 202c581c-7e81-4933-afad-28ef1d320118
dataset_id    | 5ca18d80-dd88-4f95-9e1c-665289756748
provenance_id | 9ef60317-5da5-4050-8bbc-7d6826fee49f
is_queryable  | t

The sync_dataset_metadata triggers the function _update_polygon_point_coverage_ds. However, the query doesn't work

                with spatial_coverage as (
                    select
                        dataset_id,
                        ST_union(ST_Simplify(st_buffer(spatial_coverage_index.spatial_coverage, 0.1), 0.1)) as dataset_spatial_coverage
                    from resources
                    inner join spatial_coverage_index on resources.id = spatial_coverage_index.indexed_id
                    --and resources.is_queryable is TRUE
                    where st_geometrytype(spatial_coverage_index.spatial_coverage) like '%Point'
                    and dataset_id='{dsid}'
                    group by dataset_id
                )
                update datasets
                SET spatial_coverage = sc.dataset_spatial_coverage
                from spatial_coverage sc
                where sc.dataset_id = datasets.id"""

Inspecting the issue, the SELECT is returning an empty dataset_spatial_coverage value

                select
                        dataset_id,
                        ST_union(ST_Simplify(st_buffer(spatial_coverage_index.spatial_coverage, 0.1), 0.1)) as dataset_spatial_coverage
                    from resources
                    inner join spatial_coverage_index on resources.id = spatial_coverage_index.indexed_id
                    --and resources.is_queryable is TRUE
                    where st_geometrytype(spatial_coverage_index.spatial_coverage) like '%Point'
                    and dataset_id='5ca18d80-dd88-4f95-9e1c-665289756748'
                    group by dataset_id

postgres-#                 group by dataset_id;
-[ RECORD 1 ]------------+-------------------------------------
dataset_id               | 5ca18d80-dd88-4f95-9e1c-665289756748
dataset_spatial_coverage | 
mosoriob commented 1 year ago

The issue is related to parameters: 0.1 , 0.1

mosoriob commented 1 year ago

cc: @varunratnakar

mosoriob commented 1 year ago

I propose using a simpler method

f you have multiple points and you want to enclose them in a polygon, you can use the ST_ConvexHull function in PostGIS. This function creates the minimum convex polygon that contains all the points.

Here's an example of how you can do it:

SELECT ST_ConvexHull(ST_Collect(geom)) AS polygon
FROM your_table;

Replace your_table with the actual name of your table and geom with the name of the column that contains the points.

Explanation of the example:

ST_Collect(geom) groups all the points into a multipoint geometry. ST_ConvexHull creates the minimum convex polygon that encloses the grouped points. The query will return a polygon that represents the minimal boundary containing all the points in your table. You can adjust the query according to your needs, such as adding additional conditions in the WHERE clause to filter the points or including other columns in the SELECT clause.