WorldCereal / worldcereal-classification

This repository contains the classification module of the WorldCereal system.
https://esa-worldcereal.org/
MIT License

Integrate duckDB queries into worldcereal-classification methodology #64

Closed. kvantricht closed this issue 2 months ago

kvantricht commented 3 months ago

@cbutsko, please post the code snippet we want to integrate here.

cbutsko commented 3 months ago
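import duckdb


def get_spatial_subset(bbox_poly):
    """Return, as a pandas DataFrame, all samples whose (lon, lat) point lies within bbox_poly.

    bbox_poly must expose a .wkt attribute (e.g. a shapely geometry).
    """
    db = duckdb.connect()
    # Uncomment on first use if the spatial extension is not installed yet:
    # db.sql('INSTALL spatial')
    db.load_extension('spatial')

    # Point DuckDB at the CloudFerro S3 endpoint before reading the files
    db.sql("set s3_endpoint='s3.waw3-1.cloudferro.com'")

    parquet_fpath = 's3://geoparquet/worldcereal_extractions_phase1/*/*.parquet'

    # Keep only rows whose point geometry falls inside the query polygon
    query_df = db.sql(f"""
    select *
    from read_parquet('{parquet_fpath}', hive_partitioning = 1) original_data
    where st_within(ST_Point(original_data.lon, original_data.lat), ST_GeomFromText('{bbox_poly.wkt}'))
    """).df()
    return query_df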
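
One possible refinement when integrating this, offered as a sketch rather than the agreed implementation: the `st_within` test is evaluated on a point built from `lon` and `lat`, so adding a plain range filter on those two columns may let DuckDB skip Parquet row groups using their min/max statistics. Whether that helps in practice depends on how the files are sorted. The helper name `build_subset_query` and its `bounds` argument are hypothetical; `bounds` is assumed to be `(minx, miny, maxx, maxy)`, which is what a shapely geometry returns from `.bounds`. The helper only builds the SQL string, so it can be checked without network access.

```python
def build_subset_query(parquet_fpath, wkt, bounds):
    """Build the spatial-subset SQL with an extra bounding-box prefilter.

    parquet_fpath: glob passed to read_parquet (as in the snippet above)
    wkt:           WKT of the query polygon, e.g. bbox_poly.wkt
    bounds:        (minx, miny, maxx, maxy), e.g. bbox_poly.bounds (assumed)
    """
    minx, miny, maxx, maxy = (float(v) for v in bounds)
    if minx > maxx or miny > maxy:
        raise ValueError(f"invalid bounds: {bounds}")

    # Double any single quotes so the values cannot end the SQL string literals.
    safe_path = parquet_fpath.replace("'", "''")
    safe_wkt = wkt.replace("'", "''")

    # The range filter keeps a superset of the rows that st_within accepts,
    # so the final result is unchanged; st_within still does the exact test.
    return f"""
    select *
    from read_parquet('{safe_path}', hive_partitioning = 1) original_data
    where original_data.lon between {minx!r} and {maxx!r}
      and original_data.lat between {miny!r} and {maxy!r}
      and st_within(ST_Point(original_data.lon, original_data.lat), ST_GeomFromText('{safe_wkt}'))
    """


# Example with a hand-written polygon (coordinates are illustrative only)
query = build_subset_query(
    "s3://geoparquet/worldcereal_extractions_phase1/*/*.parquet",
    "POLYGON ((4 50, 5 50, 5 51, 4 51, 4 50))",
    (4, 50, 5, 51),
)
print(query)
```

Inside `get_spatial_subset`, the resulting string could be passed to `db.sql(...)` in place of the inline f-string, using `bbox_poly.wkt` and `bbox_poly.bounds` as inputs.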