nasa / cmr-stac

HLS STAC "Oops!" error occurs daily at around 2 pm Pacific time. #366

Open matus-hodul opened 4 weeks ago

matus-hodul commented 4 weeks ago

We noticed that the HLS “Oops!” error* seems to occur at roughly the same time every day, so we wrote a script that checks the status of HLS throughout the day. Indeed, the “Oops!” error occurs every day around 2 pm Pacific time and lasts for a couple of hours, after which it resolves itself. There are other periods of the error during the day as well, but these are shorter and don’t seem to follow a specific pattern.

[Image: Screenshot 2024-10-29 at 8 59 27 AM]

* The full error is: pystac_client.exceptions.APIError "Oops! Something has gone wrong. We have been alerted and are working to resolve the problem. Please try your request again later."

import sys, os
from pystac_client import Client
import time

def checkHLS():
    try: # run a large HLS Query:
        url = r'https://cmr.earthdata.nasa.gov/stac/LPCLOUD'
        cat = Client.open(url)

        geom = {
                'type': 'Polygon',
                'coordinates': [[[-121.48363199406296, 50.58244919445883], # for an ROI near Cache Creek, BC.
                                [-121.44546042608697, 50.58244919445883], # normally these coords are derived from a .shp
                                [-121.44546042608697, 50.700797792110535], # but I didn't include that part in this demo.
                                [-121.48363199406296, 50.700797792110535],
                                [-121.48363199406296, 50.58244919445883]]]
        }

        params = {
            'intersects': geom,
            'collections': ['HLSL30.v2.0'],
            'datetime': '2018-01-01/2024-01-01'
        }

        search = cat.search(**params)
        items = search.item_collection()

        # If it works, return a status code of 1
        return 1, None, None

    except Exception as exc: # if it doesn't work, return a status code of 0 and the error message
        # catching Exception (not a bare except) lets Ctrl+C still stop a long-running check
        return 0, type(exc), exc
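
The demo above only shows the single check, not the part that runs it throughout the day. A minimal sketch of such a polling loop follows, assuming a check function that returns a (status, error type, error value) tuple like checkHLS does. The name `poll`, its parameters, and the row layout are hypothetical and are not taken from the original script.

```python
import time
from datetime import datetime, timezone


def poll(check, n_checks, interval_s, sleep=time.sleep):
    """Call check() n_checks times, interval_s seconds apart, and log each result.

    check must return a (status, error_type, error_value) tuple, as checkHLS does.
    sleep is injectable so the loop can be exercised without actually waiting.
    """
    rows = []
    for i in range(n_checks):
        status, err_type, err_value = check()
        rows.append({
            "time_utc": datetime.now(timezone.utc).isoformat(),
            "status": status,
            "error": None if err_type is None else f"{err_type.__name__}: {err_value}",
        })
        if i < n_checks - 1:  # no need to wait after the final check
            sleep(interval_s)
    return rows


# Hypothetical usage: one check every 10 minutes for 24 hours.
# rows = poll(checkHLS, n_checks=144, interval_s=600)
```

Recording timestamps in UTC keeps the log unambiguous; converting to Pacific time for plotting can happen afterwards.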

dyu-bot commented 3 weeks ago

Adding that I've also been seeing this "oops" error come up intermittently (although now I'm getting it more permanently). It was originally reported here in this thread.

aliciaaleman commented 3 weeks ago

Thanks for posting this issue in the repo. We're looking into it, but so far, it doesn't appear to be a CMR-STAC issue but rather something happening upstream with CMR.

waltersdan commented 2 weeks ago

Hello, based on the previous comment, is this still being looked at by the CMR-STAC team? If it's an upstream issue, is there somewhere else we should raise it? This bug has been breaking HLS STAC functionality regularly since the beginning of September. Is it possible to revert whatever change was made at that time? Thanks!

aliciaaleman commented 2 weeks ago

Hi @waltersdan - These issues appear to be caused by periodic bursts of user-driven activity (e.g., scripts harvesting a large amount of data). We're working alongside our operations and infrastructure teams to identify a solution. I'll post another update here when I have more information on a timeline for resolution.
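
Until a fix is in place, a client-side retry with exponential backoff is one possible way to ride out the shorter bursts, in line with the error message's own "try your request again later". The sketch below is not a CMR-STAC recommendation; `retry_with_backoff` and all of its parameters are hypothetical names chosen for illustration. It is unlikely to help with the multi-hour outage unless the delays are made very long.

```python
import random
import time


def retry_with_backoff(func, retry_on=(Exception,), max_attempts=5,
                       base_delay_s=2.0, max_delay_s=300.0, sleep=time.sleep):
    """Call func() and retry on the given exception types with growing delays.

    retry_on is a tuple of exception classes; for the STAC case one might pass
    (pystac_client.exceptions.APIError,). The delay doubles on each attempt,
    is capped at max_delay_s, and gets up to 10% random jitter so that many
    clients do not all retry at the same instant.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))


# Hypothetical usage with a search like the one earlier in this thread:
# items = retry_with_backoff(lambda: cat.search(**params).item_collection())
```

Backing off, rather than retrying immediately, also avoids adding to the load that appears to trigger the errors.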

waltersdan commented 2 weeks ago

Ok, thanks for the update!

SethDocherty commented 1 week ago

@aliciaaleman, I've had success with the cloudstac endpoint vs the stac endpoint (as noted here).

I understand that the cloudstac endpoint only contains STAC collections available in the cloud, but is there any reason I should be using stac instead to get HLSS30.v2.0 metadata? I did a quick test comparing search results (using search params similar to @matus-hodul's) between both endpoints and didn't see a difference, so I assume using either endpoint is fine.

Other than the update to the collection id, the only difference I see is the API specs.

cloudstac endpoint

Conformance Classes

OGC
- OGC API - Features - Part 1 - Core 1.0
- OGC API - Features - Part 1 - Oas30 1.0
- OGC API - Features - Part 1 - Geojson 1.0

STAC
- Core v1.0.0-beta.1
- Item Search v1.0.0-beta.1
- Item Search - Fields v1.0.0-beta.1
- Item Search - Query v1.0.0-beta.1
- Item Search - Sort v1.0.0-beta.1
- Item Search - Context v1.0.0-beta.1

stac endpoint

Conformance Classes

OGC
- OGC API - Features - Part 1 - Core 1.0
- OGC API - Features - Part 1 - Oas30 1.0
- OGC API - Features - Part 1 - Geojson 1.0
- OGC API - Common - Part 1 - Simple Query 1.0

STAC
- Core v1.0.0-rc.2
- Item Search v1.0.0-rc.2
- Ogcapi Features v1.0.0-rc.2
- Item Search - Fields v1.0.0-rc.2
- Item Search - Features v1.0.0-rc.2
- Item Search - Query v1.0.0-rc.2
- Item Search - Sort v1.0.0-rc.2
- Item Search - Context v1.0.0-rc.2
- Collection Search v1.0.0-rc.2
- Collection Search - Free Text v1.0.0-rc.2
- Collection Search - Sort v1.0.0-rc.2
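
The quick test comparing search results between the two endpoints could look roughly like the sketch below. The helper names `fetch_item_ids` and `compare_ids` are hypothetical, the cloudstac URL in the usage comment is an assumption, and the collection ids are left as placeholders because they differ between the endpoints. It also assumes that item ids are directly comparable across both endpoints.

```python
def compare_ids(ids_a, ids_b):
    """Summarize how two collections of item ids differ."""
    a, b = set(ids_a), set(ids_b)
    return {
        "only_a": sorted(a - b),  # ids returned only by the first endpoint
        "only_b": sorted(b - a),  # ids returned only by the second endpoint
        "shared": len(a & b),     # number of ids returned by both
    }


def fetch_item_ids(catalog_url, collection_id, **search_params):
    """Run one STAC search and return the ids of all matching items."""
    # Imported here so compare_ids can be used without pystac_client installed.
    from pystac_client import Client

    search = Client.open(catalog_url).search(
        collections=[collection_id], **search_params
    )
    return [item.id for item in search.items()]


# Hypothetical usage; geom is a GeoJSON geometry like the one earlier in this thread.
# stac_ids = fetch_item_ids("https://cmr.earthdata.nasa.gov/stac/LPCLOUD",
#                           "<collection id on stac>", intersects=geom,
#                           datetime="2018-01-01/2024-01-01")
# cloud_ids = fetch_item_ids("https://cmr.earthdata.nasa.gov/cloudstac/LPCLOUD",  # assumed URL
#                            "<collection id on cloudstac>", intersects=geom,
#                            datetime="2018-01-01/2024-01-01")
# print(compare_ids(stac_ids, cloud_ids))
```

Empty `only_a` and `only_b` lists would back up the observation that either endpoint returns the same items for this query, though it says nothing about differences in the item metadata itself.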