erikrikarddaniel / pfitmap

Group by/sum engine #55

Closed erikrikarddaniel closed 10 years ago

erikrikarddaniel commented 11 years ago

Construct a method that delivers a summed matrix to the client. Think RESTful, with a natural API that allows reuse, maybe something like /pfitmap.org/protein_counts?enzyme=RNR I&taxons=Bacteria,Archaea (also allow 'taxa' instead of 'taxons', which is strictly speaking more correct).
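A minimal sketch of how such a query string could be read while accepting both 'taxa' and 'taxons'. The helper name `parse_protein_counts_query` and the shape of the returned hash are hypothetical, not part of pfitmap; the space in "RNR I" is assumed to arrive percent-encoded.

```ruby
require "uri"

# Hypothetical helper (not pfitmap code): turn a query string into a selection.
# Both 'taxa' and 'taxons' are accepted as the name of the taxon list.
def parse_protein_counts_query(query)
  params = URI.decode_www_form(query).to_h
  taxa = params["taxa"] || params["taxons"] || ""
  {
    enzyme: params["enzyme"],
    taxa: taxa.split(",").map(&:strip).reject(&:empty?)
  }
end

with_taxons = parse_protein_counts_query("enzyme=RNR%20I&taxons=Bacteria,Archaea")
with_taxa   = parse_protein_counts_query("enzyme=RNR%20I&taxa=Bacteria,Archaea")
```

Both calls give the same selection, so the client can use either spelling.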

From the select (see also the "Make sure genome count is correct" issue), construct a CountMatrix object; see the DIA file doc/count_matrix.dia.

erikrikarddaniel commented 10 years ago

Try to create a join model over Protein, Taxon and ProteinCount, called TaxonProteinCount, based on a join view.

A group by sum could look like: 'TaxonProteinCount.select("domain as domain,protein_class as protein_class,sum(n_genomes) as n_genomes, sum(n_proteins) as n_proteins").group("domain, protein_class")'

Send the array of objects as JSON to the client.
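To illustrate what the group by/sum produces, here is a plain-Ruby sketch of the same aggregation over an in-memory array, followed by JSON serialisation. It does not touch the database; the rows, the class names `class_a`/`class_b`, and the helper `group_sum` are made up for the example.

```ruby
require "json"

# Made-up rows standing in for records from the TaxonProteinCount join view.
rows = [
  { domain: "Bacteria", protein_class: "class_a", n_genomes: 1, n_proteins: 2 },
  { domain: "Bacteria", protein_class: "class_a", n_genomes: 1, n_proteins: 3 },
  { domain: "Bacteria", protein_class: "class_b", n_genomes: 1, n_proteins: 1 },
  { domain: "Archaea",  protein_class: "class_a", n_genomes: 1, n_proteins: 4 }
]

# Hypothetical helper: group rows by `keys` and sum the `sums` columns,
# mirroring SELECT ..., sum(...) ... GROUP BY ... in plain Ruby.
def group_sum(rows, keys, sums)
  rows.group_by { |r| r.values_at(*keys) }.map do |key_values, group|
    totals = sums.map { |s| [s, group.sum { |r| r[s] }] }.to_h
    keys.zip(key_values).to_h.merge(totals)
  end
end

summed = group_sum(rows, [:domain, :protein_class], [:n_genomes, :n_proteins])
json = JSON.generate(summed) # the array of objects, ready to send to the client
```

In the Rails app the grouping would be done by the database as in the query above; this only shows the shape of the result.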

erikrikarddaniel commented 10 years ago

Make sure this is RESTful so that there's a URL syntax that lets a user fetch aggregated data for a selection of taxa and proteins. The RESTful API should be called by the D3 client as well to make sure all calls are through the same API.

binnisb commented 10 years ago

You can test filters and levels with: http://127.0.0.1:3000/count_matrix?taxon_level=domain&protein_level=protfamily&release=0.1&taxon_filter=Bacteria,Viruses&protein_filter=NrdB-R2lox,NrdG-PFLa

You can then change taxon_level from domain to any other level, as long as you change taxon_filter accordingly; the same goes for protein_level and protein_filter.
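A rough sketch of how the levels and filters in that URL interact, using in-memory rows instead of the database. The rows, the family name `other_family` and the helper `filter_and_sum` are invented for illustration; treating an empty filter as "no restriction" and ignoring `release` are assumptions, not a description of the actual controller.

```ruby
require "uri"

url = "http://127.0.0.1:3000/count_matrix?taxon_level=domain&protein_level=protfamily" \
      "&release=0.1&taxon_filter=Bacteria,Viruses&protein_filter=NrdB-R2lox,NrdG-PFLa"
params = URI.decode_www_form(URI.parse(url).query).to_h

# Made-up rows; each carries one column per level used in the example.
rows = [
  { domain: "Bacteria", protfamily: "NrdB-R2lox",   n_proteins: 2 },
  { domain: "Bacteria", protfamily: "NrdB-R2lox",   n_proteins: 1 },
  { domain: "Viruses",  protfamily: "NrdG-PFLa",    n_proteins: 4 },
  { domain: "Archaea",  protfamily: "NrdB-R2lox",   n_proteins: 7 },
  { domain: "Bacteria", protfamily: "other_family", n_proteins: 5 }
]

# Hypothetical helper: keep rows matching both filters, then sum n_proteins
# per (taxon level value, protein level value) pair.
def filter_and_sum(rows, params)
  taxon_level    = params.fetch("taxon_level").to_sym
  protein_level  = params.fetch("protein_level").to_sym
  taxon_filter   = params.fetch("taxon_filter", "").split(",")
  protein_filter = params.fetch("protein_filter", "").split(",")

  selected = rows.select do |r|
    (taxon_filter.empty? || taxon_filter.include?(r[taxon_level])) &&
      (protein_filter.empty? || protein_filter.include?(r[protein_level]))
  end

  selected.group_by { |r| [r[taxon_level], r[protein_level]] }
          .transform_values { |group| group.sum { |r| r[:n_proteins] } }
end

counts = filter_and_sum(rows, params)
```

Switching `taxon_level` to another column only works if the values in `taxon_filter` come from that same column, which is the point made above.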

I'm going to open another issue to address the fact that if a taxon does not have a specific protein level, it is currently just left out. So if you set the protein level to subclass, you get everything empty (should be zero, right?).
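If zeros turn out to be the wanted behaviour, one possible approach is to fill in the missing taxon/protein combinations after aggregation. This is only a sketch: `zero_fill` and the sparse `counts` hash are hypothetical, and the numbers are made up.

```ruby
# Hypothetical helper: build a dense taxa x proteins matrix from sparse counts,
# using 0 for every combination that has no row.
def zero_fill(counts, taxa, proteins)
  taxa.map do |taxon|
    proteins.map { |protein| counts.fetch([taxon, protein], 0) }
  end
end

# Sparse counts keyed by [taxon, protein]; only one combination is present.
counts = { ["Bacteria", "NrdB-R2lox"] => 3 }
matrix = zero_fill(counts, ["Bacteria", "Viruses"], ["NrdB-R2lox", "NrdG-PFLa"])
```

Here the row for Viruses and the column for NrdG-PFLa are kept and filled with zeros instead of being dropped.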

binnisb commented 10 years ago

Or maybe we need to think about this later when we look at adding the enzymes as well.