alecramsay / T

Tables as a computing model
MIT License

Column properties #13

Open alecramsay opened 2 years ago

alecramsay commented 2 years ago

With that context, rotating 90 degrees, one can imagine (derived) column properties: think of the aggregates -- min, max, sum, count, and avg, probably extended to include median (med) and standard deviation (std_dev).

We could compute these automatically and allow you to reference them in expressions. This would be like automatically doing pgm.aggregate() with no by column specified, so you'd be getting the statistics for the column values for the entire table (at that point in the pipeline).

So, you wouldn't have to do the gymnastics to do an explicit aggregate and then reference those results in the processing of the base table. (We haven't figured out/decided how to reference another table yet, but you wouldn't have to for this scenario.) You could, of course, still do an explicit pgm.aggregate() if you want to materialize that table.

Since all that info is completely derived, it doesn't seem to me to be a problem to allow it to be referenced without having to explicitly compute it.
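
To make the idea concrete, here is a minimal sketch in plain Python, not T's actual API. The helper name `column_properties`, the list-of-dicts table representation, and the sample precinct data are all hypothetical, and the choice of population (rather than sample) standard deviation is an assumption.

```python
import statistics


def column_properties(rows, col):
    """Whole-table statistics for one numeric column (hypothetical helper).

    This mimics an aggregate with no 'by' column: one set of statistics
    for every value of `col` in the table.
    """
    values = [row[col] for row in rows]
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
        "avg": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: population standard deviation, not sample.
        "std_dev": statistics.pstdev(values),
    }


# Made-up sample data: three precincts in two districts.
precincts = [
    {"District": 1, "Total": 100},
    {"District": 1, "Total": 300},
    {"District": 2, "Total": 200},
]

props = column_properties(precincts, "Total")

# A derived value that references a column property directly,
# without materializing a separate aggregate table first.
shares = [row["Total"] / props["sum"] for row in precincts]
```

In a real implementation the properties would presumably be computed lazily, at whatever point in the pipeline an expression first references them.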

This is a bit more of a stretch, but I think we could go even farther. Pivots (aggregate by's) are, by definition, completely derived from the table. So, one could imagine that there's a built-in sub-language for referring to pivot values in expressions, again without having to explicitly compute the aggregates, squirrel them away, and then reference them in the processing of the base table. I'm not sure what that language should look like, but using my "district statistics from precincts" example the hierarchy is:

District / col / statistics function

where 'District' is the aggregate/pivot "by" value, 'col' is the name of a numeric column in the table, and 'statistics function' is one of the min, max, etc. functions enumerated above. It might be useful & interesting to be able to, again, reference these values in expressions without having to explicitly compute them, store them somewhere, and then reference them.
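
One possible shape for that lookup, sketched in plain Python rather than any existing T syntax: a nested mapping indexed as by-value / col / statistics function. The name `pivot_properties`, the list-of-dicts table, and the sample data are hypothetical, and only a few of the statistics are shown.

```python
from collections import defaultdict
import statistics

# A subset of the statistics functions, keyed by name.
STAT_FUNCS = {
    "min": min,
    "max": max,
    "sum": sum,
    "count": len,
    "avg": statistics.mean,
}


def pivot_properties(rows, by, cols):
    """Build a by-value -> col -> stat name -> value mapping (hypothetical helper)."""
    # Group the values of each requested column under its 'by' value.
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for col in cols:
            groups[row[by]][col].append(row[col])
    # Apply every statistics function to every group.
    return {
        by_value: {
            col: {name: fn(values) for name, fn in STAT_FUNCS.items()}
            for col, values in by_cols.items()
        }
        for by_value, by_cols in groups.items()
    }


# Made-up sample data: three precincts in two districts.
precincts = [
    {"District": 1, "Total": 100},
    {"District": 1, "Total": 300},
    {"District": 2, "Total": 200},
]

pivot = pivot_properties(precincts, by="District", cols=["Total"])

# Each precinct's share of its own district's total, looked up as
# District / col / statistics function.
district_shares = [
    row["Total"] / pivot[row["District"]]["Total"]["sum"] for row in precincts
]
# district_shares == [0.25, 0.75, 1.0]
```

The nesting order follows the hierarchy above; a sub-language in expressions would only need to resolve those three levels against the current row.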

alecramsay commented 1 year ago

Does inspect() address this?