Open HassanMasoomi opened 1 year ago
Either the data has been updated or the slightly different approach helps, but this works fine for Belgium as of January 2024:
import planetary_computer
import pystac_client
import dask_geopandas
import deltalake
import pandas as pd  # needed for pd.DataFrame further down
LOCATION = "Belgium"
def get_table_and_credentials():
    catalog = pystac_client.Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1",
        modifier=planetary_computer.sign_inplace,
    )
    collection = catalog.get_collection("ms-buildings")
    asset = collection.assets["delta"]
    storage_options = {
        "account_name": asset.extra_fields["table:storage_options"]["account_name"],
        "sas_token": asset.extra_fields["table:storage_options"]["credential"],
    }
    # Set up DeltaTable to query URIs
    table = deltalake.DeltaTable(asset.href, storage_options=storage_options)
    return table, storage_options
# Storage options only last for so long - if they need to be reused,
# suggest try/except around any data read, and retry after re-requesting
table, storage_options = get_table_and_credentials()
# Query based on RegionName or quadkey: (key, "=", value) and/or (key, "in", values)
uris = table.file_uris([("RegionName", "=", LOCATION)])
# Read into a dask-geopandas dataframe
df = dask_geopandas.read_parquet(uris, storage_options=storage_options)
# then process as before
s = df.representative_point()
ss = pd.DataFrame(data = {'lng': s.x, 'lat': s.y})
ss.to_csv(f"{LOCATION}_MS_Bldgs.csv", index=False)
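The comment in the snippet above suggests wrapping reads in try/except and retrying after re-requesting credentials. A minimal sketch of that idea follows. The helper name `read_with_refresh` and its parameters are hypothetical (not part of any library), and it assumes that any exception raised by the read may be due to an expired token, which is broader than strictly necessary.

```python
import time


def read_with_refresh(get_credentials, read, attempts=3, delay=1.0):
    """Call read(*get_credentials()), re-requesting credentials after each failure.

    Hypothetical helper: get_credentials returns a tuple of arguments
    (e.g. table, storage_options) and read consumes them.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        # Request fresh credentials on every attempt so an expired token is replaced.
        credentials = get_credentials()
        try:
            return read(*credentials)
        except Exception as error:  # assumption: any read error may mean an expired token
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error


# Possible usage with the snippet above (not executed here):
# def read(table, storage_options):
#     uris = table.file_uris([("RegionName", "=", LOCATION)])
#     return dask_geopandas.read_parquet(uris, storage_options=storage_options)
#
# df = read_with_refresh(get_table_and_credentials, read)
```

Because dask evaluates lazily, the data is only fetched when the result is computed or written, so for the retry to cover an expired token, the step that triggers the computation would probably need to sit inside `read` as well.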
I am trying to download a representative lat/long point for each building footprint. Using the STAC API as in code chunk 1 below fails to return data for several countries (basically, "items" does not contain the full information it normally does). So I tried code chunk 2 instead, which gets what I want but is very slow. Is there a reason why that information is missing for some countries?
Code chunk 1
Code chunk 2