Closed zaneselvans closed 2 years ago
This doesn't appear to be a way we can use the parameters. They seem only to be able to select a single file path at a time. If we want to pass the DNF filters through to Dask/Pandas, we won't be able to constrain the allowable values. See this comment and this example.
Closing this as it doesn't seem to be workable.
The EPA CEMS dataset is composed of ~1300 row groups, each containing a unique combination of year and state to allow efficient pushdown filtering by time and location. Only a certain range of years (1995-2020) and set of state abbreviations (continental US plus DC) are valid for filtering. It would be nice if we could at least suggest, and preferably require, that users only attempt to filter with valid values, so that if they ask for something outside of the allowable values they get an error, rather than waiting a long time for a query that won't give them anything useful.

Is this easy to set up with the intake catalog? Can we designate an allowable set of values for years and states to be used as filters? How are user parameters meant to be used? I've seen that you can enumerate allowable values there, but they seem only to be for use in Jinja templating of the filenames, and not for things like the filters.
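
For illustration, here is a minimal sketch of the kind of up-front check described above, done in plain Python before the filters ever reach Dask/Pandas. It is not an intake feature: the function name `validate_filters` and the constants are hypothetical, and the column names `year` and `state` are assumed to match the row-group partitioning. The filters are assumed to be in the usual DNF form, either a flat list of `(column, op, value)` tuples or a list of such lists.

```python
# Hypothetical helper, not part of intake: reject DNF filters that reference
# years or states outside the ranges described in this issue.
VALID_YEARS = frozenset(range(1995, 2021))  # 1995-2020 inclusive
VALID_STATES = frozenset(
    "AL AR AZ CA CO CT DC DE FL GA IA ID IL IN KS KY LA MA MD ME "
    "MI MN MO MS MT NC ND NE NH NJ NM NV NY OH OK OR PA RI SC SD "
    "TN TX UT VA VT WA WI WV WY".split()
)  # continental US plus DC

# Assumed column names; other columns are passed through unchecked.
ALLOWED = {"year": VALID_YEARS, "state": VALID_STATES}


def validate_filters(filters):
    """Return filters unchanged, or raise ValueError on a disallowed value."""
    if filters is None:
        return None
    # Normalize a flat list of tuples into a list containing one conjunction.
    if filters and isinstance(filters[0], list):
        conjunctions = filters
    else:
        conjunctions = [filters]
    for conjunction in conjunctions:
        for column, op, value in conjunction:
            allowed = ALLOWED.get(column)
            if allowed is None:
                continue
            # "in" / "not in" take a collection; other operators take a scalar.
            values = set(value) if op in ("in", "not in") else {value}
            bad = values - allowed
            if bad:
                raise ValueError(
                    f"Invalid value(s) for {column!r}: {sorted(bad)}"
                )
    return filters


# The checked filters could then be handed on, e.g. (path is a placeholder):
#   pandas.read_parquet("epacems.parquet", filters=validate_filters(filters))
```

Note that this sketch is deliberately strict: a range comparison such as `("year", ">=", 1990)` is rejected too, because 1990 itself is not an allowable year. Whether that is the desired behavior is a design choice.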