Open MathewNWSH opened 10 months ago
"'sentinel-2-l1c' should not be valid under {}. Error is in collection"
This message says, in a rather obscure way, that the Item is missing a link with the rel type "collection"
pointing to the Collection that the Item references in its "collection"
property.
You need to add a link such as the following to the Item to pass validation:
{
"rel": "collection",
"href": "./collection.json",
"type": "application/json"
}
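A minimal sketch of adding that link programmatically, using only the Python standard library. The helper name `ensure_collection_link`, the trimmed example Item, and the default href `./collection.json` are assumptions for illustration; the href should point to wherever the Collection actually lives relative to the Item.

```python
import json


def ensure_collection_link(item: dict, href: str = "./collection.json") -> dict:
    """Return a copy of a STAC Item dict that contains a rel="collection" link."""
    links = list(item.get("links", []))
    # Only add the link if no link with rel "collection" is present yet.
    if not any(link.get("rel") == "collection" for link in links):
        links.append({"rel": "collection", "href": href, "type": "application/json"})
    return {**item, "links": links}


# Hypothetical, heavily trimmed Item used only to demonstrate the helper.
item = {"type": "Feature", "id": "example-item", "collection": "sentinel-2-l1c", "links": []}
fixed = ensure_collection_link(item)
print(json.dumps(fixed["links"], indent=2))
```

Calling the helper a second time on `fixed` leaves the links unchanged, so it can be applied safely to Items that already have the link.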
As you can see, in the items table, the column labeled "content" contains the "𒍟※" sign: [...] even though the content of the title was defined within the base item in collections table.
I think the validation error is not what leads to the 𒍟※. These characters form a "magic marker" that indicates that a key should not be rehydrated: https://github.com/search?q=repo%3Astac-utils%2Fpgstac%20%F0%92%8D%9F%E2%80%BB&type=code
As far as I understand it, it's an internal marker that is not exposed to the public via the API and is intentionally added to the database for deduplication purposes. I think it will be replaced with the value from the Item Assets Definition that is defined in the corresponding Collection (internally: a base item). Did you check how stac-fastapi makes the items available through the API? I think it should output the item as expected.
Disclaimer: That's how I read the code, I've seen this marker today for the first time ;-)
Yes, we use the item_assets property of a collection to reduce the size of the item JSON that is stored per item. This can make a huge difference in size for items that have large numbers of assets, where every item has some number of the same properties. When an asset type in the collection's item_assets is not present in an item's assets, we use the "𒍟※" marker to indicate that we should not pull that property in from the collection's item_assets (we use the base_item "view" of the collection to coerce the properties from the collection to look like an item's JSON).
Additionally, we strip the geometry, id, collection, and type from the item json to reduce disk space as those fields are promoted to actual columns in Postgres.
In the code, you can see this process referred to as hydration/dehydration going between the external json representation of a STAC Item and the pgstac internal storage.
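The round trip described above can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not pgstac's actual code: the function names `dehydrate`, `hydrate`, `to_row`, and `from_row`, the trimmed base item, and the example Item are all hypothetical, and real pgstac may differ in details such as how lists or empty objects are handled. The marker is the two-character string from the repository search linked earlier in the thread.

```python
import copy

MARKER = "\U0001235F\u203B"  # renders as 𒍟※; means "do not pull this key from the base item"
PROMOTED = ("id", "geometry", "collection", "type")  # fields kept as table columns


def dehydrate(item, base_item):
    """Drop values the base item already provides; mark base keys the item lacks."""
    out = {}
    for key, value in item.items():
        base_value = base_item.get(key)
        if isinstance(value, dict) and isinstance(base_value, dict):
            out[key] = dehydrate(value, base_value)
        elif key not in base_item or value != base_value:
            out[key] = copy.deepcopy(value)
    for key in base_item:
        if key not in item:
            out[key] = MARKER
    return out


def hydrate(stored, base_item):
    """Merge stored content back onto the base item, skipping marked keys."""
    out = {k: copy.deepcopy(v) for k, v in base_item.items() if k not in stored}
    for key, value in stored.items():
        if value == MARKER:
            continue  # the original item never had this key
        base_value = base_item.get(key)
        if isinstance(value, dict) and isinstance(base_value, dict):
            out[key] = hydrate(value, base_value)
        else:
            out[key] = copy.deepcopy(value)
    return out


def to_row(item, base_item):
    """Split an Item into promoted columns and dehydrated content."""
    columns = {k: item[k] for k in PROMOTED if k in item}
    rest = {k: v for k, v in item.items() if k not in PROMOTED}
    return columns, dehydrate(rest, base_item)


def from_row(columns, content, base_item):
    """Rebuild the external Item JSON from a stored row."""
    return {**hydrate(content, base_item), **columns}


# Hypothetical base item derived from a collection's item_assets.
base_item = {"assets": {
    "B01": {"title": "Band 1", "type": "image/jp2"},
    "thumbnail": {"title": "Thumbnail", "type": "image/png"},
}}
item = {
    "type": "Feature", "id": "S2A_example", "collection": "sentinel-2-l1c",
    "geometry": None, "properties": {"datetime": "2024-01-01T00:00:00Z"},
    "assets": {"B01": {"href": "s3://bucket/B01.jp2", "title": "Band 1", "type": "image/jp2"}},
}
columns, content = to_row(item, base_item)
print(content["assets"])
```

In this sketch the stored `B01` asset keeps only its `href`, while `thumbnail`, which the Item does not have, is stored as the marker so that rebuilding the Item does not invent an asset from the collection.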
The only validation that happens in pgstac is:
@MathewNWSH, have you been able to resolve your issue? Is there anything to follow-up on?
Hello, I managed to create a collection definition json and a sample item json. The sample item was created using a script inspired by https://github.com/dlr-eoc/EOmetadataTool. Then I found the Sentinel-2-L2A collection definition JSON on the Microsoft Planetary Computer. Based on this, and the sample item, I created a version (in my opinion) compatible with the L1C level of Sentinel-2. Here they are:
https://s3.waw3-2.cloudferro.com/swift/v1/stac/collection.json https://s3.waw3-2.cloudferro.com/swift/v1/stac/s2l1c_item.json
I managed to upload it to pgSTAC using:
Then I checked the content of collections table: https://s3.waw3-2.cloudferro.com/swift/v1/stac/collection_base_item.json https://s3.waw3-2.cloudferro.com/swift/v1/stac/collection_content.json
and items table: https://s3.waw3-2.cloudferro.com/swift/v1/stac/item_content.json
As you can see, in the items table, the column labeled "content" contains the "𒍟※" sign:
even though the content of the title was defined within the base item in collections table.
Then I tried to validate my initial JSONs using pystac validate:
but I can't quite understand the meaning of the error message.
Could you please guide me on what is wrong with the item definition JSON and how to correctly create a corresponding collection with items?