sam0x17 opened this issue 4 years ago
This is the current monkey patch I am using. Note that I had to change the base `Value(Model, Scalar)` class to accept `nil` as a default value.
```crystal
module Granite::Query::Executor
  class Value(Model, Scalar)
    include Shared

    def initialize(@sql : String, @args = [] of Granite::Columns::Type, @default : Scalar = nil)
    end

    def run : Scalar
      log @sql, @args
      # db.scalar raises when a query returns 0 results, so I'm using query_one?
      # https://github.com/crystal-lang/crystal-db/blob/7d30e9f50e478cb6404d16d2ce91e639b6f9c476/src/db/statement.cr#L18
      # raise "No default provided" if @default.nil?
      Model.adapter.open do |db|
        db.query_one?(@sql, args: @args, as: Scalar) || @default # .not_nil!
      end
    end

    delegate :<, :>, :<=, :>=, to: :run
    delegate :to_i, :to_s, :to_t, to: :run
  end
end
```
```crystal
macro define_granite_aggregate(name, aggregate, type_union, inner_type_union, default = nil, resolver = nil)
  abstract class Granite::Query::Assembler::Base(Model)
    def {{name.id}}(column : String | Symbol) : Executor::Value(Model, {{inner_type_union}})
      sql = build_sql do |s|
        s << "SELECT {{aggregate.id}}(#{column})"
        s << "FROM #{table_name}"
        s << where
        s << group_by
        s << order(use_default_order: false)
        s << limit
        s << offset
      end
      # if group_by
      #   Executor::MultiValue(Model, {{type_union}}).new sql, numbered_parameters, default: {{default}}
      # else
      Executor::Value(Model, {{inner_type_union}}).new sql, numbered_parameters, default: {{default}}
      # end
    end
  end

  abstract class Granite::Query::Builder(Model)
    def {{name.id}}(column : Symbol | String, default : Time | Int64 | Int32 | Float64 | Nil = nil) : {{type_union}}
      val = assembler.{{name.id}}(column).run
      return default if val.nil?
      {% if resolver %}
        val = val.{{resolver.id}}
      {% end %}
      val
    end
  end
end

define_granite_aggregate(:min_t, :MIN, Time?, Time?)
define_granite_aggregate(:max_t, :MAX, Time?, Time?)
define_granite_aggregate(:sum,   :SUM, Int64, Int64 | Int32, 0_i64, :to_i64)
define_granite_aggregate(:min,   :MIN, Int64?, Int64 | Int32 | Nil, nil, :to_i64)
define_granite_aggregate(:max,   :MAX, Int64?, Int64 | Int32 | Nil, nil, :to_i64)
define_granite_aggregate(:avg,   :AVG, Float64?, Float64 | PG::Numeric | Nil, nil, :to_f64)
define_granite_aggregate(:sum_f, :SUM, Float64, Float64 | PG::Numeric, 0_f64, :to_f64)
define_granite_aggregate(:min_f, :MIN, Float64?, Float64 | PG::Numeric | Nil, nil, :to_f64)
define_granite_aggregate(:max_f, :MAX, Float64?, Float64 | PG::Numeric | Nil, nil, :to_f64)
```
Note that this could be extended to work with `GROUP BY`, and that I'm using `PG::Numeric` when I should be using something more generic.
This may be a situation where the different drivers have different offerings. For example, I know the way that postgres does group by statements is different than the way mysql does it, and that affects the way the select is built.
I like the idea of adding these more advanced aggregates, but historically Granite has been careful to keep the top-level DSL uniform across the drivers. If people want the full flexibility of their database, they have to drop down to a lower level.
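As a concrete illustration of that lower level, here is a hedged sketch (the model, table, and column names are made up; it assumes Granite's `Model.adapter.open`, which the monkey patch in this thread already uses, plus crystal-db's standard `scalar` method):

```crystal
require "granite"
require "granite/adapter/pg"

class User < Granite::Base
  connection pg
  table users

  column id : Int64, primary: true
  column upvotes : Int32
  column country : String?
end

# Drop below the Granite DSL and hand driver-specific SQL straight to
# crystal-db. `scalar` returns a DB::Any, so the result is cast; note
# that `scalar` raises if the query yields zero rows, which is exactly
# what the monkey patch above works around with `query_one?`.
count = User.adapter.open do |db|
  db.scalar("SELECT COUNT(*) FROM users WHERE country = $1", "US").as(Int64)
end
```

The `$1` placeholder here is Postgres-specific (MySQL uses `?`), which is part of why keeping the top-level DSL uniform across drivers is hard.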
Yeah, on that note, I've been thinking of taking my ideas and making an ORM.
Hey Sam, just a thought: what about a higher-level ORM that delegates all but the aggregate methods to one of the supported ORMs? Not sure how that would be constructed at the moment, but the thought just struck me that a higher-level ORM providing all kinds of additional functionality on top of the existing ORMs, like these aggregate methods for example, might be a good idea.
The way I envisioned people using Granite to handle aggregate queries is to build a new read-only model for each query. Maybe this is too laborious a solution.
```crystal
class UserAgeByCountry < Granite::Base
  select "SELECT country, count(age), min(age), max(age), avg(age), sum(age) FROM users GROUP BY country"

  column country : String
  column cnt_age : Int64
  column min_age : Int64
  column max_age : Int64
  column avg_age : Int64
  column sum_age : Int64
end
```
What if we create new column types for each aggregate that take a type `T`?
```crystal
class UserAgeByCountry < Granite::Base
  table :users

  column country : String, group_by: true
  column cnt_age : Count(:age, Int64)
  column min_age : Min(:age, Int64)
  column max_age : Max(:age, Int64)
  column avg_age : Avg(:age, Int64)
  column sum_age : Sum(:age, Int64)
end
```
@drujensen I noticed this when I went looking around for how to do this. For my purposes this would be super impractical as there are literally hundreds of places I need to use aggregate methods (e.g. for the stats page in an admin control panel), and usually I am using complex where/scope chains, and then slapping an aggregate onto the end of those. If I had to create custom view models for each of these scoping situations, I would have 10-20 of them for each model.
Thanks @Blacksmoke16 and @sam0x17 for the feedback. I am torn on whether an ORM should return arbitrary query results or whether that should be a separate product, since it's no longer mapping results to an object. What this PR is doing is creating a DSL for SQL and expecting an array of named tuples with no type checking, etc. It's not leveraging the mapping at all. However, I realize this is a much-needed feature and see other ORMs doing similar solutions.
If anything you could do something like (untested):
```crystal
class User < Granite::Base
  # ...

  def self.max_age : Int32
    self.scalar("SELECT MAX(age) FROM #{self.table_name}")
  end
end
```
Or you could really generalize it; then you have context, which would help with how to set things up.
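One possible shape for that generalization, as a hedged sketch: the `define_scalar_aggregate` macro below is invented here, not part of Granite's real API, and it reuses `adapter.open` and crystal-db's `query_one?` the same way the monkey patch above does:

```crystal
require "granite"
require "granite/adapter/pg"

abstract class Granite::Base
  # Hypothetical helper, not real Granite API: defines a class method
  # `name` returning `type | Nil` by running an aggregate over `column`.
  macro define_scalar_aggregate(name, fn, column, type)
    def self.{{name.id}} : {{type}} | Nil
      adapter.open do |db|
        # query_one? returns nil for zero rows; an aggregate over an
        # empty table instead yields one NULL row, which the Value
        # executor in this thread handles with a default value.
        db.query_one?("SELECT {{fn.id}}({{column.id}}) FROM #{table_name}", as: {{type}})
      end
    end
  end
end

class User < Granite::Base
  connection pg
  table users

  column id : Int64, primary: true
  column age : Int32

  define_scalar_aggregate(max_age, MAX, age, Int32)
end

User.max_age # : Int32 | Nil
```

Because the method name, aggregate function, column, and return type are all macro arguments, each model can declare exactly the typed aggregates it needs without a custom view class per query.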
@Blacksmoke16 you wouldn't be able to append a `where`, `group_by`, `limit`, or `order_by` to that.
The Lucky Framework separates the table mapping and the query into two separate objects. Maybe that is something to consider architecturally instead of combining the two concerns into the same object. wdyt?
Could this problem be solved with virtual models? I am not sure if this would work, but if it did it could be very useful. For instance, `User.all` gives you an array of `User` records, and in the view you can reference a property of the model such as `user.email`. However, if you use `#map` to modify the data sent to the web client, you get an array of `NamedTuple`s and `user.email` gives 'undefined method'. So what would be cool would be virtual models that provide a view on the underlying model with a different field set (and thus property methods) and also additional methods such as aggregators. Any method not defined in the virtual model would delegate to the underlying model.
I don't want to have virtual model classes, but rather to be able to create them in a single line of code, e.g.

```crystal
users = User.all.map { |user| VirtualModel(User.class, properties: [name : String, email : String, created_at : DateTime]).new }
```

Then `users.to_json` gives only `name`, `email`, and `created_at` in the result, and this works both for JSON results and for ecr/slang templates.
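For the serialization half of this idea, something close is already possible today without any virtual model machinery, by projecting onto `NamedTuple`s before serializing (a sketch using a stand-in `record` instead of a real Granite model):

```crystal
require "json"

# Stand-in for a Granite-backed User; only used for illustration.
record User, name : String, email : String, created_at : Time, password_hash : String

users = [User.new("Ada", "ada@example.com", Time.utc, "secret")]

# Project each record onto just the fields we want exposed; a
# NamedTuple serializes to a JSON object with exactly these keys,
# so password_hash never reaches the client.
slim = users.map { |u| {name: u.name, email: u.email, created_at: u.created_at} }
slim.to_json # emits only name, email, and created_at
```

What this can't do is the delegation part, where undefined methods fall through to the underlying model; that is where macro-generated wrappers or the proposed `VirtualModel` would come in.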
For aggregation, how about:

```crystal
stats = VirtualModel.all(User.class,
  properties: [
    country : String,
    cnt_age : Int64,
    min_age : Int64,
    max_age : Int64,
    avg_age : Int64,
    sum_age : Int64,
  ],
  query: "SELECT country, count(age) cnt_age, min(age) min_age, max(age) max_age, avg(age) avg_age, sum(age) sum_age FROM users GROUP BY country"
)
```
@damianham I think what you're proposing is pretty much the same as what I proposed, correct? Curious why you wouldn't want to declare it as a class instead. You can leverage the query builder if you use the existing model that sits on top of your query instead.
@drujensen side note: is it possible to apply a `.scalar` like what @Blacksmoke16 used above, but to a `where` chain? Because if so, my life becomes a bit easier.
> @damianham I think what you're proposing is pretty much the same as what I proposed, correct? Curious why you wouldn't want to declare it as a class instead. You can leverage the query builder if you use the existing model that sits on top of your query instead.
I was just throwing the idea around to see what your thoughts were. I think it is similar to what you proposed except you don't populate your project with lots of classes, instead there is a single virtual model class that is part of the framework and you instantiate virtual instances inside your controllers.
I did something similar to this a long time ago in RoR, where I used `method_missing` to get fields from an attribute hash that was populated with columns from the result set record.
Thanks @damianham, I understand. So your idea is to have more of a smart object that can dynamically generate the properties. Ruby is good at that stuff, but this might be tough to pull off using macros. I personally prefer having well-defined objects for queries, but I see how you may just want to grab the `max(:age)` without creating a whole class and cluttering the code base.
@sam0x17 There is an executor that runs for scalar values: https://github.com/amberframework/granite/blob/0db7835eece5edd143c5b549069e19de68789ba5/src/granite/query/executors/value.cr#L8 I don't recall when this executor is invoked vs the list executor.
The biggest difficulty is if you have a complex scope/query chain, and you need to grab a max or an avg off of that.
So here's a real scenario where not having these makes things extremely difficult:
Let's say I have a `User` model with an `Int32` field called `upvotes`, a `country` field, a `state` field, and a `city` field, and I want to know the average number of upvotes for users in a particular city, state, and country, created in January...
```crystal
class User < Granite::Base
  connection pg
  table users

  column id : Int64, primary: true
  column email : String
  column password_hash : String
  column upvotes : Int32
  column city : String?
  column state : String?
  column country : String?

  timestamps
end
```
I would expect to just be able to do something like:
```crystal
User.where(:created_at, :gt, Time.utc - 4.months)
    .where(:created_at, :lt, Time.utc - 3.months)
    .where(country: "US")
    .where(state: "MD")
    .where(city: "Baltimore")
    .avg(:upvotes)
```
This is an incredibly common sort of construction for people building web apps, and in particular for people coming from Rails where it is easy to chain aggregates onto complex where chains.
The current proposed solution is to make a custom version of the model (sort of like a materialized "view") that houses a custom select clause, something like `UsersBetweenDateInSpecificLocationView`. This would be sort of OK if I could just have that custom view inherit from my base Granite model, but inheritance doesn't work in Granite, so I have to completely recreate the model with all the fields involved in the query just to get the answer to some aggregate.
One solution that would work would be the ability to have multiple custom select clauses per model (e.g. allow a name to be passed along with the macro, so you get named custom select scopes), though I'd still prefer to just have working chainable aggregates.
I have a monkey patch I've been using locally for this for a while. The tricky part is handling types correctly, as `Executor::Value` and `Executor::MultiValue` are tricky to work with. In particular, this has to work with timestamps as well as with numeric columns. The different aggregate functions have some special caveats when it comes to types:

- `avg` can return a float when the column is an int type
- `avg` can return `nil` when there are no records that match the query
- `min` and `max` can return `nil` when there are no records that match the query
- `sum` defaults to 0 in the appropriate type when there are no records
- `min` and `max` can apply to timestamps as well as float and int columns
- `Int32` vs `Int64` and `Float32` vs `Float64` create some issues -- I try to stick with the larger type to be safe -- we might have to be more specific with this to handle things like `BIGINT`

Because of all the type constraints, this may have to have type-specific versions, e.g. `min` for ints, `min_f` for floats, `min_t` for time, etc., and I'd be fine with that.

In Rails these are called `minimum`/`maximum`/`average`/`sum`.

My expectation would be the ability to do things like:
```crystal
Widget.where(:created_at, :gt, Time.now - 10.days).avg(:units_sold) # => Float64?
Widget.min(:created_at)                                             # => Time?
Product.where(cool: true).sum(:coolness)                            # => Int64 (defaults to 0)
Comment.where(deleted_at: nil).max(:downvotes)                      # => Int64?
```

Possibly we would provide a second argument allowing one to specify a default value, e.g.:

```crystal
User.minimum(:created_at, Time.utc) # => Time (defaults to Time.utc if no records found)
```

I'm happy to submit a PR for this, just wanted to collect some feedback first.