stac-utils / pystac

Python library for working with any SpatioTemporal Asset Catalog (STAC)
https://pystac.readthedocs.io

Processing extension can't be used for validation? #845

Closed m-mohr closed 2 years ago

m-mohr commented 2 years ago

When I run validation on items or collections whose JSON files declare the processing extension, I get an error saying that a JSON pointer can't be resolved. I'm on PySTAC 1.4.0.

I'm reporting this here because stac-node-validator doesn't have this issue, so I assume the JSON Schema is correct (it also looks correct to me).

Error: jsonschema.exceptions.RefResolutionError: Unresolvable JSON pointer: 'definitions/require_any_field'

Example JSON:

{
  "type": "Collection",
  "id": "goes-glm",
  "stac_version": "1.0.0",
  "description": "The Lightning Detections: Events, Groups, and Flashes product consists of a hierarchy of earth-located lightning radiant energy measures including events, groups, and flashes. Lightning events are detected by the instrument. Lightning groups are a collection of one or more lightning events that satisfy temporal and spatial coincidence thresholds. Similarly, lightning flashes are a collection of one or more lightning groups that satisfy temporal and spatial coincidence thresholds. The product includes the relationship among lightning events, groups, and flashes, and the area coverage of lightning groups and flashes. The product also includes processing and data quality metadata, and satellite state and location information.",
  "links": [
    {
      "rel": "item",
      "href": "./item.json",
      "type": "application/geo+json",
      "title": "OR_GLM-L2-LCFA_G16_s20203662359400_e20210010000004_c20210010000030"
    },
    {
      "rel": "root",
      "href": "./collection.json",
      "type": "application/json",
      "title": "GLM L2 Lightning Detections: Events, Groups, and Flashes"
    },
    {
      "rel": "license",
      "href": "https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C01527",
      "title": "License"
    },
    {
      "rel": "about",
      "href": "https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C01527",
      "type": "text/html",
      "title": "Product Landing Page"
    },
    {
      "rel": "about",
      "href": "https://www.goes-r.gov/users/docs/PUG-main-vol1.pdf",
      "type": "application/pdf",
      "title": "Product Definition and Users' Guide (PUG) Vol.1 Main"
    },
    {
      "rel": "about",
      "href": "https://www.goes-r.gov/products/docs/PUG-L2+-vol5.pdf",
      "type": "application/pdf",
      "title": "Product Definition and Users' Guide (PUG) Vol.5 Level 2+ Products"
    },
    {
      "rel": "cite-as",
      "href": "https://doi.org/10.7289/V5KH0KK6"
    },
    {
      "rel": "self",
      "href": "https://raw.githubusercontent.com/stactools-packages/goes-glm/main/examples/collection.json",
      "type": "application/json"
    }
  ],
  "stac_extensions": [
    "https://stac-extensions.github.io/processing/v1.1.0/schema.json",
    "https://stac-extensions.github.io/scientific/v1.0.0/schema.json",
    "https://stac-extensions.github.io/table/v1.2.0/schema.json",
    "https://stac-extensions.github.io/item-assets/v1.0.0/schema.json"
  ],
  "sci:doi": "10.7289/V5KH0KK6",
  "sci:citation": "GOES-R Algorithm Working Group and GOES-R Series Program, (2018): NOAA GOES-R Series Geostationary Lightning Mapper (GLM) Level 2 Lightning Detection: Events, Groups, and Flashes. [indicate subset used].NOAA National Centers for Environmental Information. doi:10.7289/V5KH0KK6. [access date].",
  "item_assets": {
    "geoparquet_events": {
      "title": "Processed GeoParquet file for events",
      "type": "application/x-parquet",
      "roles": [
        "data",
        "cloud-optimized"
      ],
      "table:primary_geometry": "geometry"
    },
    "geoparquet_flashes": {
      "title": "Processed GeoParquet file for flashes",
      "type": "application/x-parquet",
      "roles": [
        "data",
        "cloud-optimized"
      ],
      "table:primary_geometry": "geometry"
    },
    "geoparquet_groups": {
      "title": "Processed GeoParquet file for groups",
      "type": "application/x-parquet",
      "roles": [
        "data",
        "cloud-optimized"
      ],
      "table:primary_geometry": "geometry"
    },
    "netcdf": {
      "title": "Original netCDF 4 file",
      "type": "application/netcdf",
      "roles": [
        "data",
        "source"
      ]
    }
  },
  "title": "GLM L2 Lightning Detections: Events, Groups, and Flashes",
  "extent": {
    "spatial": {
      "bbox": [
        [
          -141.56,
          -66.56,
          -8.44,
          66.56
        ]
      ]
    },
    "temporal": {
      "interval": [
        [
          "2022-07-01T00:00:00Z",
          null
        ]
      ]
    }
  },
  "license": "proprietary",
  "keywords": [
    "NOAA",
    "GOES",
    "GOES-16",
    "GOES-17",
    "GLM",
    "Atmosphere",
    "Environmental",
    "Lightning",
    "Weather",
    "netCDF",
    "GeoParquet"
  ],
  "providers": [
    {
      "name": "DOC/NOAA/NESDIS",
      "description": "Provided by:\n\n* U.S. Department of Commerce\n* National Oceanic and Atmospheric Administration\n* National Environmental Satellite, Data, and Information Services",
      "roles": [
        "producer",
        "licensor"
      ],
      "url": "https://www.goes.noaa.gov"
    }
  ],
  "summaries": {
    "mission": [
      "GOES"
    ],
    "constellation": [
      "GOES"
    ],
    "platform": [
      "GOES-16",
      "GOES-17"
    ],
    "instruments": [
      "FM1",
      "FM2"
    ],
    "gsd": [
      8000
    ],
    "processing:level": [
      "L2"
    ],
    "goes:image-type": [
      "FULL DISK"
    ],
    "goes:orbital-slot": [
      "West",
      "East"
    ]
  }
}

duckontheweb commented 2 years ago

@m-mohr I'm not able to reproduce this on PySTAC v1.4.0 and jsonschema v4.6.0. Could you post your jsonschema version and an example script that shows how you are running the validation?

Here is what I get with the example above saved at ./test-validation-error.json:

>>> from pystac.validation import validate_dict
>>> import json
>>> with open("./test-validation-error.json") as src:
...     coll_dict = json.load(src)
... 
>>> validate_dict(coll_dict)
['https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json', 'https://stac-extensions.github.io/scientific/v1.0.0/schema.json', 'https://stac-extensions.github.io/processing/v1.1.0/schema.json', 'https://stac-extensions.github.io/table/v1.2.0/schema.json', 'https://stac-extensions.github.io/item-assets/v1.0.0/schema.json']

m-mohr commented 2 years ago

Ah, good call. I was on jsonschema 3.2.0 and updating it to the latest version fixed it. Not sure how that happened. Sorry for the noise.

duckontheweb commented 2 years ago

We currently require jsonschema >= 3.0.0, so the version you had installed was valid. I’m going to reopen this so we can change that dependency to be sure we avoid this issue.
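
Until the dependency is tightened, a quick way to spot the problem locally is to check the installed jsonschema release before validating. The following is only a sketch using the standard-library `importlib.metadata`; the helper names and the `ASSUMED_MINIMUM` of 4.0.0 are assumptions for illustration. The thread only establishes that 3.2.0 failed and 4.6.0 worked, not which minimum version should be required.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Assumption: the thread only shows 3.2.0 failing and 4.6.0 working;
# the exact minimum working version is not stated there.
ASSUMED_MINIMUM = (4, 0, 0)


def parse_release(version_string: str) -> Tuple[int, ...]:
    """Turn '3.2.0' into (3, 2, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_release(package: str = "jsonschema") -> Optional[Tuple[int, ...]]:
    """Return the installed release tuple, or None if the package is missing."""
    try:
        return parse_release(version(package))
    except PackageNotFoundError:
        return None


def is_at_least(release: Tuple[int, ...], minimum: Tuple[int, ...]) -> bool:
    """Compare release tuples element by element."""
    return release >= minimum
```

Calling `installed_release()` and passing the result to `is_at_least(..., ASSUMED_MINIMUM)` would flag an environment like the one reported above, where jsonschema 3.2.0 satisfied the `>= 3.0.0` requirement but still failed to resolve the pointer.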

m-mohr commented 2 years ago

Ah, I thought I had seen a 4.x requirement somewhere, but then it's indeed a good idea to update.