Open louisstuart96 opened 1 month ago
The double negative `nohydrate=false` is a bit cumbersome to reason through here, but I think the default is logical. When "use api" is turned off, `nohydrate==false`, which means the DB will perform the hydration. If "use api" is turned on, `nohydrate==true`, so the database will skip the hydration step and it must be performed on the API side.
| `use_api_hydrate` | `nohydrate` | Hydration performed in |
|---|---|---|
| false | false | database |
| true | true | API |
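As a rough illustration of the mapping in the table, here is a minimal Python sketch. The helper names `with_hydration_conf` and `hydration_side` are hypothetical and are not part of stac-fastapi-pgstac or PgSTAC. The only thing taken from this thread is that the API's `use_api_hydrate` value is passed through as `nohydrate` in the search configuration.

```python
from typing import Any, Dict, Optional


def with_hydration_conf(
    use_api_hydrate: bool, conf: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return a copy of a search conf with `nohydrate` set.

    Hypothetical helper: it only illustrates the mapping in the table.
    """
    merged = dict(conf or {})
    # nohydrate=True asks the database to skip hydration,
    # so the API has to hydrate the items itself.
    merged["nohydrate"] = use_api_hydrate
    return merged


def hydration_side(conf: Dict[str, Any]) -> str:
    """Name the component expected to hydrate items for a given conf."""
    return "API" if conf.get("nohydrate", False) else "database"


for flag in (False, True):
    conf = with_hydration_conf(flag)
    print(f"use_api_hydrate={flag} -> nohydrate={conf['nohydrate']} -> {hydration_side(conf)}")
```

Read this way, the two flags always carry the same boolean value, and only the name flips from "the API hydrates" to "the database does not".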
Whether the default is right for your setup is a bit subjective, though. In our experience, the option should target wherever you have the most spare compute. The Planetary Computer has a single, large pgstac database server instance, and using DB hydration resulted in high CPU usage there, slowing down queries across the board. We also had a fairly large API cluster, though, so we were able to spread that CPU load across its nodes and the DB could remain responsive. If you have more compute in your DB server than in your API instances, it may be better to keep hydration in the DB.
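If you want to switch sides per deployment, one option is to drive the flag from the environment. The sketch below is only an assumption about how that could look: the variable name `USE_API_HYDRATE` and the helper `read_use_api_hydrate` are illustrative, so check the settings documentation of your stac-fastapi-pgstac version for the actual name and parsing rules. The fallback of `false` mirrors the default discussed in this thread.

```python
import os
from typing import Mapping

# Strings treated as "on"; anything else counts as "off".
TRUTHY = {"1", "true", "yes", "on"}


def read_use_api_hydrate(env: Mapping[str, str] = os.environ) -> bool:
    """Read the hydration flag from an environment-like mapping.

    Assumption: the setting is exposed as USE_API_HYDRATE.
    Falls back to False, i.e. hydration stays in the database.
    """
    raw = env.get("USE_API_HYDRATE", "false")
    return raw.strip().lower() in TRUTHY


# Example: an API-heavy deployment opts in, a DB-heavy one keeps the default.
print(read_use_api_hydrate({"USE_API_HYDRATE": "true"}))
print(read_use_api_hydrate({}))
```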
Our team is testing eoapi (on top of stac-fastapi-pgstac) against STAC items with lots of asset links. We ran into performance problems in search requests that use the 'query' or 'filter' extensions. Our assumption is that the application's hydration setting causes this problem.
https://github.com/stac-utils/stac-fastapi-pgstac/blob/a81e4d76abd2e460882a55b78ac4b2c7e34ff510/stac_fastapi/pgstac/core.py#L164
Here, the app's default setting is `use_api_hydrate = False`, which in turn becomes `nohydrate=false` in the PgSTAC query. However, the correct setting should be `nohydrate=true`: https://stac-utils.github.io/pgstac/pgstac/#runtime-configurations