microsoft / planetary-computer-apis

Planetary Computer APIs
MIT License

Unable to dump collection from azurite container #112

Open maybe78 opened 2 years ago

maybe78 commented 2 years ago

I've successfully deployed the dev environment and checked that the naip collection exists. I then tried to dump the collection config from the internal Azurite storage, following the steps described in docs/collection-config.md and using pcapis from ./scripts/console:

pcapis dump -t collection --account=devstoreaccount1 --table=collectionconfig --sas=$SAS --output=collectionconfig.json --account-url=http://*.*.*.*:10002/devstoreaccount1

and got the following error:

root@ed21df10fee0:/opt/src# pcapis dump -t collection --account=devstoreaccount1 --table=collectionconfig --sas=$SAS --output=collectionconfig.json --account-url=http://*.*.*.*:10002/devstoreaccount1
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_models.py", line 363, in _get_next_cb
    return self._command(
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_generated/operations/_table_operations.py", line 386, in query_entities
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'Bad Request'
Content: {"odata.error":{"code":"InvalidInput","message":{"lang":"en-US","value":"The query condition specified in the request is invalid.\nRequestId:83d710d6-6f7e-4f91-98f6-d64218398d4a\nTime:2022-07-22T09:19:56.753Z"}}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pcapis", line 33, in <module>
    sys.exit(load_entry_point('pccommon', 'console_scripts', 'pcapis')())
  File "/opt/src/pccommon/pccommon/cli.py", line 234, in cli
    return dump(**args)
  File "/opt/src/pccommon/pccommon/cli.py", line 67, in dump
    for (_, collection_id, col_config) in col_config_table.get_all():
  File "/opt/src/pccommon/pccommon/tables.py", line 204, in get_all
    for entity in table_client.query_entities(""):
  File "/usr/local/lib/python3.9/site-packages/azure/core/paging.py", line 128, in __next__
    return next(self._page_iterator)
  File "/usr/local/lib/python3.9/site-packages/azure/core/paging.py", line 76, in __next__
    self._response = self._get_next(self.continuation_token)
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_models.py", line 372, in _get_next_cb
    _process_table_error(error)
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_error.py", line 153, in _process_table_error
    _reraise_error(decoded_error)
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_error.py", line 145, in _reraise_error
    raise decoded_error.with_traceback(exc_traceback)
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_models.py", line 363, in _get_next_cb
    return self._command(
  File "/usr/local/lib/python3.9/site-packages/azure/data/tables/_generated/operations/_table_operations.py", line 386, in query_entities
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: The query condition specified in the request is invalid.
RequestId:83d710d6-6f7e-4f91-98f6-d64218398d4a
Time:2022-07-22T09:19:56.753Z
ErrorCode:InvalidInput
Content: {"odata.error":{"code":"InvalidInput","message":{"lang":"en-US","value":"The query condition specified in the request is invalid.\nRequestId:83d710d6-6f7e-4f91-98f6-d64218398d4a\nTime:2022-07-22T09:19:56.753Z"}}}
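
The last application frame in the traceback, in pccommon/tables.py, calls `table_client.query_entities("")`, and the empty filter string appears to be what Azurite rejects as an invalid query condition. Below is a minimal sketch of a possible workaround, assuming the intent is only to enumerate every row. `iter_all_entities` is a hypothetical helper, not part of pccommon; `list_entities()` is the `TableClient` method in azure-data-tables that lists entities without taking a filter argument. Whether this avoids the Azurite error has not been verified here.

```python
from typing import Any, Dict, Iterable, Iterator, Protocol


class SupportsListEntities(Protocol):
    """The one TableClient method this sketch relies on."""

    def list_entities(self) -> Iterable[Dict[str, Any]]: ...


def iter_all_entities(table_client: SupportsListEntities) -> Iterator[Dict[str, Any]]:
    """Yield every entity in the table as a plain dict.

    Hypothetical helper: it calls list_entities() rather than
    query_entities(""), so no empty filter string is passed in.
    """
    for entity in table_client.list_entities():
        yield dict(entity)
```

With a real client this could be called as `iter_all_entities(table_client)`, where `table_client` is the same object that `get_all` already uses.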

The pcapis load command works correctly:

pcapis load -t collection --account=devstoreaccount1 --table=collectionconfig --sas=$SAS --file=alos-palsar-config.json --account-url=http://localhost:10002/devstoreaccount1

I can see that the collection appeared in the collectionconfig table in the internal storage, but it is not visible in the Explorer and is also missing from the STAC API response for http://*.*.*.*:8080/stac/collections/alos-palsar-mosaic:

{"code":"NotFoundError","description":"No collection with id 'alos-palsar-mosaic' found!"}

P.S. I'm still running into a lot of trouble because of the lack of documentation. Could I contact someone from your team, and perhaps contribute a better tutorial for beginners in the future?

mmcfarland commented 2 years ago

Based on the issue, it looks like you may be expecting the STAC collections to be stored within the Azure Table/Azurite system. STAC metadata is stored within the PostgreSQL database, using pgSTAC. The data stored within Azurite is configuration used for non-STAC spec information like our visualizations and mosaics. To load new collections/items, you'll want to refer to the pgSTAC documentation (which is still being developed).
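
Because the STAC API serves collections from pgSTAC rather than from the Azurite tables, one quick way to see which collections are actually loaded is to list the ids returned by the `/collections` endpoint. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the STAC root URL (for example `http://localhost:8080/stac` in a dev environment) is an assumption based on the request shown in the issue.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def collection_ids(payload: Dict[str, Any]) -> List[str]:
    """Extract collection ids from a STAC API /collections response body."""
    return [collection["id"] for collection in payload.get("collections", [])]


def fetch_collection_ids(stac_root: str, timeout: float = 10.0) -> List[str]:
    """GET <stac_root>/collections and return the ids it lists.

    stac_root is an assumed base URL; adjust it to your deployment.
    """
    url = stac_root.rstrip("/") + "/collections"
    with urlopen(url, timeout=timeout) as response:
        return collection_ids(json.load(response))
```

If `alos-palsar-mosaic` is absent from the returned list, the collection has not been loaded into pgSTAC, regardless of what is present in the collectionconfig table.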

In general, this Planetary Computer repo shows how we've assembled and extended the underlying open-source tools (stac-fastapi, pgstac, titiler-pgstac) and deployed them on Azure, as a reference implementation. You may want to reference those tools directly to understand their usage in the Planetary Computer. For the PALSAR data specifically, you can see the tooling we used to create Collection and Item metadata via the stactools package.