stac-utils / pgstac

Schema, functions and a python library for storing and accessing STAC collections and items in PostgreSQL
MIT License

Include performance benchmarking as part of integration testing. #118

Open sharkinsspatial opened 2 years ago

sharkinsspatial commented 2 years ago

@philvarner has released https://github.com/stac-utils/stac-api-benchmark to allow benchmarking query consistency and performance across STAC API implementations. New PRs should probably run these benchmarks as part of the integration testing pipeline and compare results against previous branches to identify consistency or performance regressions.

philvarner commented 2 years ago

I'll add a quiet mode today that only outputs a final JSON results file.

Also, the "random queries" test requires (I think) three queryables that each take a value from 0 to 100, and I need to make this a bit more flexible.

philvarner commented 2 years ago

Does this look reasonable for our output format? (The numbers are seconds of runtime; they're really low b/c I only ran a few queries for each benchmark.)

{
  "step": 0.4397075420129113,
  "tnc": 0.4970361669547856,
  "countries_apr_2019": 0.3382589580141939,
  "countries_cloud_cover_asc": 0.39972841599956155,
  "random_queries": 47.06457624997711,
  "repeated": 17.237872541998513,
  "sort_cloud_cover_desc": [
    {
      "sort": "sentinel-2-l2a_properties.eo:cloud_cover_desc",
      "duration": 0.38575045799370855
    }
  ],
  "sort_cloud_cover_asc": [
    {
      "sort": "sentinel-2-l2a_properties.eo:cloud_cover_asc",
      "duration": 0.3100657499744557
    }
  ],
  "sort_datetime_desc": [
    {
      "sort": "sentinel-2-l2a_properties.datetime_desc",
      "duration": 0.054874249966815114
    }
  ],
  "sort_datetime_asc": [
    {
      "sort": "sentinel-2-l2a_properties.datetime_asc",
      "duration": 0.34525220800423995
    }
  ],
  "sort_created_desc": [
    {
      "sort": "sentinel-2-l2a_properties.created_desc",
      "duration": 0.15877308399649337
    }
  ],
  "sort_created_asc": [
    {
      "sort": "sentinel-2-l2a_properties.created_asc",
      "duration": 0.05182845803210512
    }
  ]
}

sharkinsspatial commented 2 years ago

👍 @philvarner

philvarner commented 2 years ago

Merged to main. The best way to run it is probably to set --verbosity ERROR so that the usual output doesn't interfere with the results.
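
For the comparison against previous branches suggested at the top of this issue, a minimal Python sketch might look like the following. It assumes two results files in the JSON shape posted above, where each value is either a single duration in seconds or a list of `{"sort", "duration"}` entries. The function names, the file names in the usage note, and the 20% tolerance are hypothetical and not part of stac-api-benchmark.

```python
import json


def flatten_results(results):
    """Map each benchmark name to one duration in seconds."""
    flat = {}
    for name, value in results.items():
        if isinstance(value, list):
            # Sort benchmarks report a list of {"sort": ..., "duration": ...} entries;
            # sum them so each benchmark contributes a single number.
            flat[name] = sum(entry["duration"] for entry in value)
        else:
            flat[name] = float(value)
    return flat


def find_regressions(baseline, candidate, tolerance=0.2):
    """Return {name: ratio} for benchmarks that slowed down by more than `tolerance`.

    `tolerance` is a fraction (0.2 means 20% slower) and is an arbitrary,
    assumed threshold; benchmarks missing from `candidate` are skipped.
    """
    base = flatten_results(baseline)
    cand = flatten_results(candidate)
    regressions = {}
    for name, base_duration in base.items():
        if name not in cand or base_duration <= 0:
            continue
        ratio = cand[name] / base_duration
        if ratio > 1 + tolerance:
            regressions[name] = ratio
    return regressions


def load_results(path):
    """Read one JSON results file from disk."""
    with open(path) as handle:
        return json.load(handle)
```

In a CI step this could be called as `find_regressions(load_results("main.json"), load_results("pr.json"))`, failing the job when the returned dict is non-empty. With runs as short as the ones above, timing noise would likely dominate, so the threshold would need tuning against longer runs.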

vincentsarago commented 2 years ago

+1 on performance testing!

https://github.com/stac-utils/stac-api-benchmark to allow benchmarking query consistency and performance across STAC API implementations

To me it seems that this is a benchmark that should be run in each stac-fastapi backend, which as far as I understand might not always be up to date with pgstac.

I'll try to start something, maybe using pytest-benchmark to test the SQL methods, and then maybe use https://github.com/benchmark-action/github-action-benchmark to make sure we get a report.

vincentsarago commented 2 years ago

I've started a really quick demo over at https://github.com/vincentsarago/pgstac-benchmark

I wonder now: which features do we want to benchmark?