jjrom / resto

A metadata catalog and search engine for geospatialized data
Apache License 2.0

Root catalog has invalid STAC object somewhere in the child & item link relation tree #356

Closed: philvarner closed this issue 1 year ago

philvarner commented 1 year ago

Describe the bug

Somewhere in the Catalog child & item link relation tree starting at https://tamn.snapplanet.io there is an invalid STAC object.

Running this code, which traverses those links to retrieve items:

```python
import pystac
list(pystac.Catalog.from_file("https://tamn.snapplanet.io/").get_all_items())
```

results in pystac.errors.STACTypeError: JSON does not represent a STAC object.

There's not a lot of information from that error, so it might require stepping through in a debugger.
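
As an alternative to a debugger, a small standalone walker can report which href fails. This is only a sketch and not part of pystac: `find_invalid`, `fetch_json` and `STAC_TYPES` are hypothetical names, and the validity check is a crude heuristic that assumes STAC 1.0.0 objects carrying a `type` field, which is much simpler than pystac's own identification logic.

```python
import json
from typing import Callable, Iterator, Tuple
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: STAC 1.0.0 objects always carry one of these "type" values.
STAC_TYPES = {"Catalog", "Collection", "Feature", "FeatureCollection"}


def fetch_json(href: str) -> dict:
    """Default fetcher: download and decode one JSON document."""
    with urlopen(href) as response:
        return json.load(response)


def find_invalid(
    root_href: str,
    fetch: Callable[[str], dict] = fetch_json,
    max_objects: int = 1000,
) -> Iterator[Tuple[str, str]]:
    """Follow child and item links from root_href, yielding (href, problem)."""
    seen = set()
    queue = [root_href]
    while queue and len(seen) < max_objects:
        href = queue.pop()
        if href in seen:
            continue
        seen.add(href)
        try:
            obj = fetch(href)
        except Exception as exc:  # report the failing href instead of aborting
            yield href, f"fetch failed: {exc!r}"
            continue
        if not isinstance(obj, dict) or obj.get("type") not in STAC_TYPES:
            yield href, "JSON does not look like a STAC object"
            continue
        for link in obj.get("links", []):
            if link.get("rel") in ("child", "item") and "href" in link:
                # Resolve relative hrefs against the document they came from.
                queue.append(urljoin(href, link["href"]))
```

Calling `list(find_invalid("https://tamn.snapplanet.io/"))` would then list the offending hrefs rather than raising on the first one; `max_objects` caps the crawl because the tree can be large.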

jjrom commented 1 year ago

Indeed. Corrected in commit https://github.com/jjrom/resto/commit/aae5294f50f80fa3e24d2999de2eaa224527f0e7

jjrom commented 1 year ago

However, this now triggers a pystac error: "UnicodeEncodeError: 'ascii' codec can't encode character '\u0306' in position 111: ordinal not in range(128)". Not a resto error, IMHO.

philvarner commented 1 year ago

I let the pystac team know about this. However, I can't reproduce it locally.

gadomski commented 1 year ago

Looks like a sub-catalog has a non-ASCII character in its id: https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:NaĕGenahiraPalata:. This breaks GET requests. IMO this is a resto bug, because the link's href should be percent-encoded?

Here's the link that produced the error during resolution:

```json
{
  "rel": "child",
  "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:NaĕGenahiraPalata:",
  "type": "application/json",
  "title": "Næ̆gĕnahira paḷāta",
  "matched": 1138
}
```
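
Links like this one can be spotted by checking each href for non-ASCII characters. The following is a minimal sketch, assuming the STAC object has already been loaded into a Python dict; `non_ascii_hrefs` is a hypothetical helper name, not a pystac or resto function.

```python
from typing import List


def non_ascii_hrefs(stac_object: dict) -> List[str]:
    """Return the hrefs in stac_object["links"] that contain non-ASCII characters."""
    return [
        link["href"]
        for link in stac_object.get("links", [])
        # str.isascii() is False as soon as one code point is above U+007F.
        if not link.get("href", "").isascii()
    ]
```

Running it over a parent catalog narrows the search to the links that an ASCII-only HTTP request line cannot carry as-is.
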

Source catalog:

```json
{
  "type": "Catalog",
  "id": "SriLanka",
  "stac_version": "1.0.0",
  "description": "Search on SriLanka",
  "links": [
    {
      "rel": "root",
      "href": "https://tamn.snapplanet.io",
      "type": "application/json",
      "title": "Welcome to resto STAC"
    },
    {
      "rel": "items",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/_",
      "type": "application/json",
      "title": "Sri Lanka"
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:BasnahiraPalata:",
      "type": "application/json",
      "title": "Basnāhira paḷāta",
      "matched": 513
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:DakunuPalata:",
      "type": "application/json",
      "title": "Dakuṇu paḷāta",
      "matched": 754
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:MadhyamaPalata:",
      "type": "application/json",
      "title": "Madhyama paḷāta",
      "matched": 1026
    },
    {
      "rel": "items",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:NaegenahiraPalata:",
      "type": "application/json",
      "title": "Næ̆gĕnahira paḷāta",
      "matched": 1
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:NaĕGenahiraPalata:",
      "type": "application/json",
      "title": "Næ̆gĕnahira paḷāta",
      "matched": 1138
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:SabaragamuvaPalata:",
      "type": "application/json",
      "title": "Sabaragamuva paḷāta",
      "matched": 512
    },
    {
      "rel": "items",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:UturumaedaPalata:",
      "type": "application/json",
      "title": "Uturumæ̆da paḷāta",
      "matched": 1
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:UturumaĕDaPalata:",
      "type": "application/json",
      "title": "Uturumæ̆da paḷāta",
      "matched": 778
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:UturuPalata:",
      "type": "application/json",
      "title": "Uturu paḷāta",
      "matched": 512
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:UvaPalata:",
      "type": "application/json",
      "title": "Ūva paḷāta",
      "matched": 768
    },
    {
      "rel": "child",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603/region:VayambaPalata:",
      "type": "application/json",
      "title": "Vayamba paḷāta",
      "matched": 517
    },
    {
      "rel": "self",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147/country:SriLanka:1227603",
      "type": "application/json"
    },
    {
      "rel": "parent",
      "href": "https://tamn.snapplanet.io/catalogs/classifications/geographical/continent/continent:Asia:6255147",
      "type": "application/json",
      "title": "Asia"
    }
  ],
  "title": "Sri Lanka"
}
```

jjrom commented 1 year ago

@gadomski Thanks for adding context to the issue. You're right that resto did not encode the URL path. This is corrected in #360, and tamn.snapplanet.io has been updated accordingly.
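
For readers unfamiliar with percent-encoding, here is an illustrative Python sketch of what encoding the path of an href looks like. It is not the code from #360 (resto is not written in Python); `encode_href` and the `example.test` URL are made up for the example, and keeping `%` in the safe set is an assumption made to avoid double-encoding hrefs that are already encoded.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_href(href: str) -> str:
    """Percent-encode the non-ASCII characters in the path of href."""
    parts = urlsplit(href)
    # Keep "/" and ":" readable; "%" stays untouched so that already
    # encoded sequences are not encoded a second time (assumption).
    path = quote(parts.path, safe="/:%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# "e" followed by U+0306 (combining breve), the character named in the error.
print(encode_href("https://example.test/catalogs/region:Nae\u0306GenahiraPalata:"))
# prints https://example.test/catalogs/region:Nae%CC%86GenahiraPalata:
```

`quote` encodes the string as UTF-8 before escaping, so the combining breve becomes the two bytes `%CC%86` and the resulting href is plain ASCII.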