Loosely inspired by https://maxhalford.github.io/blog/ogd-in-sql/, especially the last paragraph. If changefeed expressions supported a limited set of PARTITION BY clauses, restricted to intervals over the primary key (or otherwise guaranteed to fall within a single changefeed processor), users could write arbitrary streaming calculations over events with bounded resources, using the Postgres OVER ... PARTITION BY syntax and semantics. That in turn would let you write changefeeds on things like "unusual values in column X, given the column X values in other recently seen rows", or even "values in column Y that are surprising given a multivariate regression on the other columns in the table". The same approach would cover duplicate values, etc.
(This could also be extended into more of a global map-reduce by using the job record to store state, but that's harder.)
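
To make the "unusual values in column X" case concrete, here is a rough Python sketch of the kind of per-partition streaming calculation this would allow. It is not changefeed code: `RunningStats`, `unusual_events`, `threshold`, and `min_rows` are hypothetical names, and the z-score test is just one assumed definition of "unusual". The point is that each partition only needs constant-size state, so total memory is bounded by the number of partitions a processor owns.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Constant-size running mean/variance (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self) -> float:
        # Sample standard deviation; 0.0 until there are two rows.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def unusual_events(events, threshold=3.0, min_rows=5):
    """Yield (partition, x) for values far from that partition's running mean.

    `events` is an iterable of (partition, x) pairs in arrival order.
    A value is compared against the rows seen before it in the same
    partition, then folded into that partition's statistics.
    """
    stats = defaultdict(RunningStats)  # one small state object per partition
    for partition, x in events:
        s = stats[partition]
        sd = s.stddev()
        if s.n >= min_rows and sd > 0 and abs(x - s.mean) > threshold * sd:
            yield partition, x
        s.update(x)
```

Here `partition` stands in for whatever the PARTITION BY clause evaluates to (for example, a primary key interval), and the loop stands in for a single changefeed processor consuming its events. Because no partition's state is shared across processors, nothing needs to be coordinated globally.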
Jira issue: CRDB-25143
Epic CRDB-21713