stac-utils / pystac-client

Python client for searching STAC APIs
https://pystac-client.readthedocs.io

Optionally ignore errors during iteration #680

Closed EmanuelCastanho closed 5 months ago

EmanuelCastanho commented 5 months ago

I have this piece of code:

results_data = []
for item in results.items():
    results_data.append([item.id, item.datetime, item.bbox, str(item.links[3]).split('=')[2][:-1]])

At some point in the loop it gives me this error: pystac_client.exceptions.APIError: <!doctype html><html lang="en"><head><title>HTTP Status 400 – Bad Request</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 400 – Bad Request</h1></body></html>

I understand that some item is causing the loop to break, but how can I skip that item in the generator?

gadomski commented 5 months ago

Try something like this:

for item in results.items():
    try:
        results_data.append([item.id, item.datetime, item.bbox, str(item.links[3]).split('=')[2][:-1]])
    except Exception as err:
        print(f"error during iteration: {err}")
EmanuelCastanho commented 5 months ago

Hi @gadomski ,

I tried that already and it gives the same error. I think the problem is inside the items() generator. I have never worked with generators, but do you think the solution might be here: https://stackoverflow.com/questions/11366064/handle-an-exception-thrown-in-a-generator ?
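
For context on why wrapping the loop body does not help: the exception is raised while the `for` statement asks the generator for its next value, which happens outside a `try` block placed in the loop body. Here is a minimal sketch with a stand-in generator. `fake_items` and `iterate_with_inner_try` are hypothetical names used only for illustration; they imitate a paginated search that fails partway and are not pystac-client code.

```python
def fake_items():
    """Hypothetical stand-in for results.items(): yields two items, then fails."""
    yield "item-1"
    yield "item-2"
    # Simulated failure while fetching the next page
    raise RuntimeError("HTTP Status 400 - Bad Request")


def iterate_with_inner_try(gen):
    """Mirror the earlier suggestion: try/except around the loop body only."""
    seen = []
    for item in gen:  # the generator's error surfaces here, outside the try
        try:
            seen.append(item)
        except Exception as err:
            print(f"error during iteration: {err}")
    return seen


gen = fake_items()
try:
    iterate_with_inner_try(gen)
    escaped = False
except RuntimeError:
    escaped = True  # the inner try/except never saw the error

# Once a generator has raised, it is finished: iterating again yields nothing.
leftover = list(gen)
```

In this sketch the error escapes the inner handler, and the generator cannot be resumed afterwards, which is why skipping a single "bad" element from outside the generator is not possible.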

gadomski commented 5 months ago

My apologies, I didn't think too hard about the problem ... that's my bad. It may be nice to have an ignore_errors flag in the generator functions. I'm converting this issue into an enhancement request for that.

EmanuelCastanho commented 5 months ago

Thank you!

Probably something like this:

def items(self, ignore_errors: bool = False) -> Iterator[Item]:
    """Iterator that yields :class:`pystac.Item` instances for each item matching
    the given search parameters.

    Args:
        ignore_errors (bool): Whether to ignore errors. Default is False.
    """
    for item in self.items_as_dicts():
        try:
            # already signed in items_as_dicts
            yield Item.from_dict(item, root=self.client, preserve_dict=False)
        except Exception:
            if not ignore_errors:
                # bare raise keeps the original traceback
                raise
gadomski commented 5 months ago

@EmanuelCastanho I started on an implementation, but it wasn't feeling right. I'm not sure that an ignore_errors argument is the right solution to the problem. Even if we ignore that 400 - Bad Request error, the iterator still has to exit immediately — there's no "next page" for the iterator to fetch from the server. I think your best course of action would be:

  1. Figure out why you're getting a 400 - Bad request response in the first place, and try to fix that
  2. Accumulate items until you run into the error, e.g.:
from pystac_client.exceptions import APIError

items = []
try:
    for item in results.items():
        items.append(item)
except APIError:
    pass
# you've gotten as many items as you can
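
The accumulate-until-error approach can also be wrapped in a small helper that returns both the partial results and the error, so the caller can tell whether the run was complete. This is only a sketch: `collect_until_error` and `flaky` are hypothetical names, not part of pystac-client, and the failing generator below merely simulates a search that breaks on a later page.

```python
from typing import Iterable, List, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def collect_until_error(
    iterable: Iterable[T],
    exc_type: Type[BaseException] = Exception,
) -> Tuple[List[T], Optional[BaseException]]:
    """Collect items until the iterable is exhausted or raises exc_type.

    Returns the items gathered so far and the caught exception (None if
    iteration finished normally).
    """
    collected: List[T] = []
    try:
        for item in iterable:
            collected.append(item)
    except exc_type as err:
        return collected, err
    return collected, None


def flaky():
    """Hypothetical generator that fails after two items."""
    yield 1
    yield 2
    raise ValueError("simulated 400 - Bad Request")


items, error = collect_until_error(flaky(), ValueError)
if error is not None:
    print(f"stopped after {len(items)} items: {error}")
```

With the search from this thread, the call would presumably look like `collect_until_error(results.items(), APIError)`, with `APIError` imported from `pystac_client.exceptions`.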
EmanuelCastanho commented 5 months ago

I was checking this just now and I will see if I can deal with the error.

Meanwhile, the error can be replicated with:

from pystac_client import Client

api = Client.open("https://eocat.esa.int/eo-catalogue/")
results = api.search(method="GET",
                     collections=["GEOSAT-2.Portugal.Coverage", "GEOSAT2SpainCoverage10"],
                     max_items=None,
                     filter="instrument = 'HiRAIS' and sensorMode = 'PM4'")
results_data = []
for item in results.items():
    results_data.append([item.id])

Unfortunately, I don't know which product is causing the crash. The example has 5046 products, and the error happens at 4840.

gadomski commented 5 months ago

Looks like it's a problem with the server — it's generating really long next links as you get further in the iteration. I ran your example myself (with a couple tweaks) and it eventually errored w/ a really long link.

The script:

import json

import tqdm

from pystac_client import Client

api = Client.open("https://eocat.esa.int/eo-catalogue/")
results = api.search(
    method="GET",
    collections=["GEOSAT-2.Portugal.Coverage", "GEOSAT2SpainCoverage10"],
    max_items=None,
    filter="instrument = 'HiRAIS' and sensorMode = 'PM4'",
)
progress_bar = tqdm.tqdm()
try:
    last_page = None
    for page in results.pages_as_dicts():
        progress_bar.total = page["numberMatched"]
        progress_bar.update(page["numberReturned"])
        last_page = page
except Exception:
    # dump the last page that was fetched successfully, including its links
    print(json.dumps(last_page))
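
To watch the next link grow from page to page rather than only inspecting the last page, one option is to record the length of each page's `rel="next"` href. The sketch below assumes each page dict carries a `links` list of objects with `rel` and `href` keys, as STAC API item collections normally do. `next_href` is a hypothetical helper, and the sample pages are made up for illustration.

```python
from typing import Any, Dict, List, Optional


def next_href(page: Dict[str, Any]) -> Optional[str]:
    """Return the href of the page's rel="next" link, or None if absent."""
    for link in page.get("links", []):
        if link.get("rel") == "next":
            return link.get("href")
    return None


# Hypothetical pages imitating a server whose next link grows on every request.
pages = [
    {"links": [
        {"rel": "self", "href": "https://example.com/search"},
        {"rel": "next", "href": "https://example.com/search?status=1&startRecord=11"},
    ]},
    {"links": [
        {"rel": "next", "href": "https://example.com/search?status=1&status=1&startRecord=21"},
    ]},
    {"links": []},  # last page: no next link
]

lengths: List[Optional[int]] = []
for page in pages:
    href = next_href(page)
    lengths.append(len(href) if href is not None else None)

print(lengths)
```

In the script above, the same helper could be called on each `page` inside the loop to log the link length alongside the progress bar.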

Going to close this issue as not-a-problem-with-pystac-client — you can use this issue as a reference if you contact that API's maintainers to tell them about this issue.

The really long link https://eocat.esa.int/eo-catalogue/search?collections=GEOSAT-2.Portugal.Coverage%2CGEOSAT2SpainCoverage10&filter=instrument+%3D+%27HiRAIS%27+and+sensorMode+%3D+%27PM4%27&filter-lang=cql2-text&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,
4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,
4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,
4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&status=695,4351&startRecord=4841
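
The repetition in a link like this one can be quantified with the standard library by counting how often each query parameter occurs. This is a minimal sketch: `repeated_params` is a hypothetical helper, and the URL below is a shortened, made-up example with the same shape as the link above.

```python
from collections import Counter
from typing import Dict
from urllib.parse import parse_qsl, urlsplit


def repeated_params(url: str) -> Dict[str, int]:
    """Map each query parameter that occurs more than once to its count."""
    counts = Counter(name for name, _ in parse_qsl(urlsplit(url).query))
    return {name: n for name, n in counts.items() if n > 1}


# Shortened, hypothetical URL imitating the structure of the failing link.
url = (
    "https://example.com/search?collections=a%2Cb"
    "&status=695,4351&status=695,4351&status=695,4351&startRecord=4841"
)
print(repeated_params(url))
```

A report like this may be useful evidence when contacting the API's maintainers, since it shows which parameter is being appended on every page.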