dkubb / axiom

Simplifies querying of structured data using relational algebra
https://github.com/dkubb/axiom
MIT License

How can I produce COUNT (*) queries? #59

Open snusnu opened 9 years ago

snusnu commented 9 years ago

I'm trying to implement pagination, and for the UX (pagination links) I need to know the total count of tuples in the paginated relation. The current API for Relation#summarize, and for adding a :count column specifically, forces me to know the name of an attribute that exists in the relation. Ideally that wouldn't be necessary, because it leaves the burden of knowing relation internals to the call site: in order to paginate, I need to know one (arbitrary?) NOT NULL attribute.

I'm currently pondering whether I should simply accept having to know one attribute name (and so include it as a param in the pagination API), or whether I should do something "clever" to get at one arbitrary name. Something like the following (which IMO really is ugly):

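def count
  relation.summarize { |r|
    # #count needs an attribute to operate on, so use the name of the
    # first attribute of the relation's first key.
    r.add(:count, r.send(relation.header.keys.to_a[0].to_a[0].name).count)
  }.sort.one[:count]
end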

Both options seem less than ideal to me. It seems like what I really want is COUNT(*), but nowhere in the axiom-sql-generator specs can I find an example of axiom actually being able to produce that.

Any pointers would be very much appreciated! Thx in advance!

dkubb commented 9 years ago

@snusnu the primitives are in place, but I agree that it might make sense to have convenience methods for the common cases like #count and the other aggregate operations.

As for being able to produce COUNT(*), I'll have to look and see if there is an equivalent in relational algebra, or more specifically in Tutorial D, which is what axiom is based on.

snusnu commented 9 years ago

@dkubb looking at Python's DEE docs, I see nothing related to COUNT(*). It seems like they require knowing an attribute too.

I haven't looked at Tutorial D itself though, it might have something to say about it ...

In the meantime, I'm leaning towards not having to know an attribute name to do pagination. I'll keep on using the code pasted above, to get at one existing, NOT NULL attribute.

snusnu commented 9 years ago

@dkubb seems like I've misread the DEE docs for COUNT. It does not need an attribute name, only the relation to count, as its argument: http://www.quicksort.co.uk/DeeDoc.html#count-len

blambeau commented 9 years ago

@snusnu FYI Tutorial D simply has aggregate operators that accept a relation as their first argument. For instance, COUNT(...) simply returns the number of tuples of the input relation (which can, of course, be any relation expression). SUM(..., ATTRNAME) is similar, and so on.
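To make the shape of those operators concrete, here is a minimal plain-Ruby sketch. It does not use axiom or Tutorial D: a relation is modelled as a Set of Hash tuples, and the `count` and `sum` helpers and the `EMPLOYEES` data are hypothetical names made up for illustration. The point is only that the relation itself is the first argument, so counting needs no attribute name.

```ruby
require 'set'

# Illustration only: a relation is modelled as a Set of Hash tuples.
# This is not how axiom or Tutorial D represent relations.
EMPLOYEES = Set[
  { name: 'Ann', dept: 'eng', salary: 120 },
  { name: 'Bob', dept: 'eng', salary: 100 },
  { name: 'Cid', dept: 'ops', salary: 90 }
]

# COUNT(r): number of tuples in the relation; no attribute name required.
def count(relation)
  relation.size
end

# SUM(r, ATTRNAME): sum of one attribute over all tuples of the relation.
def sum(relation, attr_name)
  relation.sum { |tuple| tuple.fetch(attr_name) }
end

# The argument can be any relation expression, for example a restriction.
engineers = EMPLOYEES.select { |tuple| tuple[:dept] == 'eng' }.to_set

puts count(EMPLOYEES)        # prints 3
puts count(engineers)        # prints 2
puts sum(EMPLOYEES, :salary) # prints 310
```

For pagination, only `count` would be needed, applied to the unpaginated relation expression.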

SUMMARIZE can be shown as being a shorthand for a longer expression relying on EXTEND, relation-valued attributes (RVAs) and those aggregate operators. Such shorthand does use the same aggregate operators, but their first argument becomes implicit.
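A rough plain-Ruby sketch of that longhand, under the same kind of assumptions: relations are Sets of Hash tuples, and `project`, `extend_with`, `summarize_count` and the `ORDERS` data are hypothetical helpers invented here, not axiom or Tutorial D APIs. It summarizes by building a relation-valued attribute per group and then applying the explicit-argument `count` to it.

```ruby
require 'set'

# Illustration only: a relation is modelled as a Set of Hash tuples.
ORDERS = Set[
  { customer: 'ann', item: 'pen' },
  { customer: 'ann', item: 'ink' },
  { customer: 'bob', item: 'pad' }
]

# Aggregate operator with an explicit relation argument.
def count(relation)
  relation.size
end

# Project a relation onto the given attribute names (duplicates collapse).
def project(relation, attr_names)
  relation.map { |tuple| tuple.slice(*attr_names) }.to_set
end

# EXTEND: add one computed attribute to every tuple.
def extend_with(relation, attr_name)
  relation.map { |tuple| tuple.merge(attr_name => yield(tuple)) }.to_set
end

# Longhand for "summarize by the given attributes, adding a count":
# 1. project onto the grouping attributes,
# 2. extend each group tuple with a relation-valued attribute (RVA)
#    holding the matching tuples of the original relation,
# 3. extend again by applying the aggregate to that RVA,
# 4. project the RVA away.
def summarize_count(relation, by)
  groups = project(relation, by)
  with_rva = extend_with(groups, :matching) do |group|
    relation.select { |tuple| tuple.slice(*by) == group }.to_set
  end
  with_count = extend_with(with_rva, :count) { |tuple| count(tuple[:matching]) }
  project(with_count, by + [:count])
end

per_customer = summarize_count(ORDERS, [:customer])
total = summarize_count(ORDERS, [])
```

Here `per_customer` holds one tuple per customer with its count, and `total`, summarized by no attributes at all, holds the single tuple `{ count: 3 }`. In this sketch an empty input relation yields an empty result rather than a zero count, which is one reason a standalone count operator is the more direct tool for pagination.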