Open anayeaye opened 4 months ago
`discovery/` request:

    {
      "bucket": "veda-data-store-staging",
      "collection": "omi-19-item-collection-deleteme",
      "datetime_range": "year",
      "discovery": "s3",
      "filename_regex": "^(.*).tif$",
      "prefix": "OMI_trno2-COG/"
    }
    AIRFLOW_CTX_DAG_OWNER=airflow
    AIRFLOW_CTX_DAG_ID=veda_discover
    AIRFLOW_CTX_TASK_ID=subdag_discover.discover_from_s3
    AIRFLOW_CTX_EXECUTION_DATE=2024-07-19T15:53:42+00:00
    AIRFLOW_CTX_TRY_NUMBER=1
    AIRFLOW_CTX_DAG_RUN_ID=d222b047-d453-4980-acad-f40d473320c6
    [2024-07-19, 15:53:50 UTC] {{logging_mixin.py:137}} INFO - Getting S3 response iterator for bucket: veda-data-store-staging, prefix: OMI_trno2-COG/
    [2024-07-19, 15:53:50 UTC] {{taskinstance.py:1768}} ERROR - Task failed with exception
    Traceback (most recent call last):
      File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 247, in execute
        condition = super().execute(context)
      File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 175, in execute
        return_value = self.execute_callable()
      File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 192, in execute_callable
        return self.python_callable(*self.op_args, **self.op_kwargs)
      File "/usr/local/airflow/dags/veda_data_pipeline/groups/discover_group.py", line 36, in discover_from_s3_task
        return s3_discovery_handler(
      File "/usr/local/airflow/dags/veda_data_pipeline/utils/s3_discovery.py", line 251, in s3_discovery_handler
        item["item_id"] = id_template.format(item["item_id"])
    AttributeError: 'NoneType' object has no attribute 'format'
    [2024-07-19, 15:53:50 UTC] {{taskinstance.py:1318}} INFO - Marking task as FAILED. dag_id=veda_discover, task_id=subdag_discover.discover_from_s3, execution_date=20240719T155342, start_date=20240719T155349, end_date=20240719T155350
    [2024-07-19, 15:53:50 UTC] {{standard_task_runner.py:100}} ERROR - Failed to execute job 2754 for task subdag_discover.discover_from_s3 ('NoneType' object has no attribute 'format'; 24409)
    [2024-07-19, 15:53:50 UTC] {{local_task_job.py:208}} INFO - Task exited with return code 1
@anayeaye Are you able to give me a quick TL;DR run-through of this (or rather, what I need to set up to replicate it) when you're awake? 🤞
What
When the discovery endpoint is used to add items to an existing collection, it fails when `id_template` is not provided in the request.

Note
The `id_template` default value is set in the `s3_discovery` util.
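One way a default can be bypassed like this, offered as an assumption rather than a confirmed diagnosis: if the request payload carries `id_template` as an explicit `None` instead of omitting the key, `dict.get` with a default never falls back, because the key is present. The `event` dict and the `"{}"` default below are hypothetical stand-ins, not the actual `s3_discovery` code.

```python
# Hypothetical event as the handler might receive it when the request omits
# id_template but the payload still carries the key with a null value.
event = {"collection": "omi-19-item-collection-deleteme", "id_template": None}

# dict.get only uses its default when the key is missing, so this stays None.
id_template = event.get("id_template", "{}")  # "{}" is an assumed default

try:
    # Same call shape as the failing line in the traceback.
    item_id = id_template.format("example-item")
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

If this is what happens, the fix belongs where the template is read, not where the default is declared.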
How to reproduce
collection.json
```json
{
  "id": "omi-19-item-collection-deleteme",
  "type": "Collection",
  "links": [],
  "title": "DELETE ME 19 item collection OMI_trno2",
  "extent": {
    "spatial": {
      "bbox": [[-180, -90, 180, 90]]
    },
    "temporal": {
      "interval": [[null, null]]
    }
  },
  "license": "MIT",
  "description": "OMI_trno2 - 0.10 x 0.10 Annual as Cloud-Optimized GeoTIFFs (COGs)",
  "item_assets": {
    "cog_default": {
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "roles": ["data", "layer"],
      "title": "Default COG Layer",
      "description": "Cloud optimized default layer to display on map"
    }
  },
  "stac_version": "1.0.0",
  "renders": {
    "dashboard": {
      "colormap_name": "reds",
      "rescale": [[0, 3000000000000000.0]],
      "assets": ["cog_default"],
      "title": "VEDA Dashboard Render Parameters"
    }
  },
  "providers": [
    {
      "name": "NASA VEDA",
      "url": "https://www.earthdata.nasa.gov/dashboard/",
      "roles": ["host"]
    }
  ],
  "item_assets": {
    "test_asset": {
      "title": "An item asset description for test",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "roles": ["test"]
    },
    "cog_default": {
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "roles": ["data", "layer"],
      "title": "Default COG Layer",
      "description": "Cloud optimized default layer to display on map"
    }
  },
  "assets": {
    "thumbnail": {
      "title": "Thumbnail",
      "description": "Photo by [Mick Truyts](https://unsplash.com/photos/x6WQeNYJC1w) (Power plant shooting steam at the sky)",
      "href": "https://thumbnails.openveda.cloud/no2--dataset-cover.jpg",
      "type": "image/jpeg",
      "roles": ["thumbnail"]
    }
  }
}
```
`discovery/` request without providing `id_template` in config. For the above collection

Error log
AC
The `discovery` endpoint does not fail when `id_template` is not provided.
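A minimal sketch of a guard that would satisfy this criterion, assuming the template is applied with positional `str.format` as in the traceback. The function name `apply_id_template`, the constant `DEFAULT_ID_TEMPLATE`, and the `"{}"` fallback are hypothetical; the real default lives in the `s3_discovery` util.

```python
from typing import Optional

# Assumed fallback: a bare placeholder leaves the item id unchanged.
DEFAULT_ID_TEMPLATE = "{}"


def apply_id_template(item_id: str, id_template: Optional[str] = None) -> str:
    """Format item_id with id_template, falling back when it is None or empty."""
    # `or` covers both a missing argument and an explicit None (or "").
    template = id_template or DEFAULT_ID_TEMPLATE
    return template.format(item_id)


print(apply_id_template("example-item"))
print(apply_id_template("example-item", "omi-{}"))
```

Using `or` instead of a `dict.get` default means an explicit `None` in the request is handled the same way as a missing key.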