rpbouman / huey

A UI for DuckDB
MIT License

Feature: multi-measure metrics (correlation and friends) #105

Open rpbouman opened 1 month ago

rpbouman commented 1 month ago

There is a collection of aggregate functions that take two measures as arguments, such as correlation, covariance, and the regression functions. These apply to numeric arguments.

The idea to present these in the UI would be: each numerical column would get a folder called "correlation & regression". This could either be in the statistics folder, or at the same level as the statistics folder.

Inside the correlation and regression folder, we would have a folder for each multi-metric aggregate function we want to support. That folder would contain one item for each of the other numerical columns. Each item represents the aggregate function applied to the main column and that other column.

For example, we have a dataset of persons and numerical columns weight and height. We would then have a correlation folder for weight, containing one item called height, and that would calculate corr(weight, height). Vice versa, the height column would also have a correlation folder and it would contain the item corr(height, weight).
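
To make the proposed layout concrete, here is a minimal sketch in Python of how the folders and items could be generated from a list of numerical columns. It is not Huey's actual code: the function names `build_pair_folders` and `quote_identifier`, the nested-dict tree, and the choice of which aggregates to list are all assumptions made for illustration. The aggregate names themselves (`corr`, `covar_pop`, `covar_samp`, `regr_slope`, `regr_intercept`, `regr_r2`) are existing DuckDB functions.

```python
# Hypothetical sketch of the proposed folder layout; not Huey's actual code.

# Two-argument aggregates available in DuckDB. Which of these the UI
# should expose is an assumption made for this example.
PAIR_AGGREGATES = [
    "corr", "covar_pop", "covar_samp",
    "regr_slope", "regr_intercept", "regr_r2",
]


def quote_identifier(name):
    """Quote a column name as a SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_pair_folders(numeric_columns, aggregates=PAIR_AGGREGATES):
    """Return {main column: {aggregate: {other column: SQL expression}}}.

    Each main column gets one folder per aggregate, and each folder holds
    one item per *other* numerical column.
    """
    tree = {}
    for main in numeric_columns:
        others = [col for col in numeric_columns if col != main]
        tree[main] = {
            agg: {
                other: f"{agg}({quote_identifier(main)}, {quote_identifier(other)})"
                for other in others
            }
            for agg in aggregates
        }
    return tree


tree = build_pair_folders(["weight", "height"])
print(tree["weight"]["corr"]["height"])  # corr("weight", "height")
print(tree["height"]["corr"]["weight"])  # corr("height", "weight")
```

With n numerical columns, each aggregate folder under a column holds n - 1 items. Correlation is symmetric, so the two `corr` items in the example return the same value, but argument order does matter for the regression functions: DuckDB's `regr_slope(y, x)` treats the first argument as the dependent variable, so the main column would play that role in this sketch.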