feathersjs / feathers

The API and real-time application framework
https://feathersjs.com
MIT License
15.08k stars 752 forks source link

Implementing "group by" #677

Closed FredericLatour closed 7 years ago

FredericLatour commented 7 years ago

In case I would like to use sequelize "group" clause, do I need to use a hook? Is that the proper approach?

daffl commented 7 years ago

You can add params.sequelize.group, e.g. in a hook

app.servivce('dbservice').find({
  sequelize: { group: 'name' }
});

See params.sequelize documentation, Sequelize groupBy documentation.

FredericLatour commented 7 years ago

Unfortunately, this doesn't really make much sense to me. Certainly because I'm just scratching the surface of your tool. If I would decide to group data, I would do it at the query level. What kind of magic would this this sequelize object bring so that it would let me use "group" instructions within a query and moreover let seqelize know how to handle it?

daffl commented 7 years ago

params.sequelize is what is being passed (with additional parameters from the adapter) to Model.findAll.

Even if it doesn't make sense to you, did what I posted work when you tried it? You can also modify it in a hook by setting hook.params.sequelize = { group: 'groupcolumn' } as shown in the adapter documentation.

FredericLatour commented 7 years ago

Nothing wrong with trying something that I don't understand (it happens to me a lot), however I don't even know where I should put this. Basically, I used the generator for the app, used the genrator for a service, updated the model generated file. Now, where should I put your magic?

daffl commented 7 years ago

How Feathers hooks work and basic Sequelize usage is pretty essential here. Eventually I do recommend to go through the basics guide and learn how to get started with plain Sequelize.

For this time though, run feathers generate hook, call it groupBy, choose it to be a before hook, select the service(s) you want to add it to and find as the method:

screen shot 2017-09-18 at 4 33 42 pm

Then change src/hooks/group-by.js to

// Use this hook to manipulate incoming or outgoing data.
// For more information on hooks see: http://docs.feathersjs.com/api/hooks.html

module.exports = function (options = {}) { // eslint-disable-line no-unused-vars
  return function groupBy (hook) {
    hook.params.sequelize = { group: 'name' };
  };
};
FredericLatour commented 7 years ago

You are right, I don't know much about sequelize. I'm not a big fan of ORM in general. My experience is that things tends to get messy because you always have to work around ORM limitations. That said I can understand how an ORM makes perfect sense in the case of feathersjs. I get an overall understanding of feathersjs as well as how hooks work and what they are for. However, I still I have a hard time to understand how setting 'params.sequelize' to { group: 'name' } will suddenly unlock sequelize engine so that it can handle "group" instructions that are part of a query... It will certainly click at one time or another ...

I'm going to try your hook for now

carcinocron commented 7 years ago

@FredericLatour have you checked out knex.js? It's a great query builder, and if you needed an ORM there are still some ORMs built on top of knex.js.

FredericLatour commented 7 years ago

@daffl Sorry I couldn't provide any feedback earlier. I'm testing this outside my work. I get immediately an error regarding a normal column in the group statement. Maybe some misunderstanding. let me possibly re-phrase what I would like to do:

What I'm trying to do is to be able to have the opportunity to issue "group" clause in my queries. The ideal solution would be that "groupBy" would be part of the common query feature, but I imagine that this kind of operation is not supported by some databases you are interfacing. Now because you are interfacing to sequelize and because sequelize implement a group feature (well kind of - the feature is implemented by the database in the end), the most obvious solution would be be able to pass additional parameters that will be passed along to sequelize in order to properly grouping data.

From the error I get, I guess that your solution is asking sequelize to group on 'normal' column from within the before hook. But that's not really what I want to do. I don't always want to group the data and I don't want to group the data on always the same column. I want to pass the grouping options at query time ...

Is that something possible?

Thanks

FredericLatour commented 7 years ago

@InstanceOfMichael You are right, I could certainly use knex adaptater instead of Sequelize but that would not change fundamentally my problem. I need something dynamic at the query level, but all I could find is setting fixed additional query options at the service level using a hook.

daffl commented 7 years ago

Hooks are run per request and also contain the query in hook.params.query. That allows you to pretty much implement any custom logic, in your case

// Use this hook to manipulate incoming or outgoing data.
// For more information on hooks see: http://docs.feathersjs.com/api/hooks.html

module.exports = function (options = {}) { // eslint-disable-line no-unused-vars
  return function groupBy (hook) {
    const groupBy = hook.params.query.$group;

    if(groupBy) {
      hook.params.sequelize = { group: groupBy };
      delete hook.params.query.$group;
    }
  };
};

Again, I recommend reading through the guides and then getting familiar with the API documentation for hooks.

FredericLatour commented 7 years ago

@daffl I couldn't find an example where I can see dynamic query manipulation. If you can point me to some link, that would be appreciated.

So, If I understand correctly:

In order to fill in the gap of your sample code, I suppose that I need to parse $group which must be the exact same object I passed in the query. Say I just put a field, I would have something like this: hook.params.sequelize = { group: groupBy };

Am I correct?

Aside questionq:


I wonder how is that that there as so few questions about grouping data. We have to display statistics information sometimes. Seems not trivial at all. What's your recommendation on this ? Do you believe it should not be in the scope of feathers?

Thanks

FredericLatour commented 7 years ago

@pedrogk No, I didn't dig further. It seemed overly complex to get something that simple and common. I just don't get that with the hundreds of page of documentation there is not a single example of an aggregation or at least a clear explanation on how to pass through a sql query (if that's possible). But maybe that's me. In the end I thought that if I need to invest hours and hours each time I don't have a plain simple query, I'd better inverst this time into GraphQL.

daffl commented 7 years ago

I don't think I understand what the problem is here. I showed how to customize the query and just verified again that it works.

The rest is learning about Sequelize which has plenty of examples. For example this, or this can be turned into a before find hook for a Sequelize service like this:

// Use this hook to manipulate incoming or outgoing data.
// For more information on hooks see: http://docs.feathersjs.com/api/hooks.html
const sequelize = require('sequelize');

module.exports = function (options = {}) { // eslint-disable-line no-unused-vars
  return function groupBy (hook) {
    const groupBy = hook.params.query.$group;

    if(groupBy) {
      const seq = {
        attributes: [
          groupBy,
          [sequelize.fn('count', sequelize.col(groupBy)), 'count']
        ],
        group: [ groupBy ]
      };

      hook.params.sequelize = seq;
      delete hook.params.query.$group;
    }
  };
};

And it will return a result like this:

{
  "total": [
    {
      "count": 2
    },
    {
      "count": 2
    }
  ],
  "limit": 10,
  "skip": 0,
  "data": [
    {
      "type": "admin",
      "count": 2
    },
    {
      "type": "user",
      "count": 2
    }
  ]
}

You can also always make Sequelize queries directly by accessing app.service('sequelizemodel').Model.

Feathers is just an API connection layer which connects your choice of database/ORM with your frontend of choice. We do have some built in adapters which also allow to customize your queries but for more complex queries there is no way around being familiar with the ORM and database you want to use.

DesignByOnyx commented 7 years ago

@FredericLatour - I have been using feathers sequelize for 2 years and it has never gotten in my way, nor have I ever wished feathers did more for me.

From reading the conversation above, it seems that you don't entirely understand how sequelize and/or feathers works. That's not a jab at you, it's just an observation from having helped hundreds of people the last 2 years - it's easy to spot those who don't quite understand yet.

So forget feathers for a minute - write the code that you would write with just plain sequelize. If you cannot do this, then feathers is not the problem. Once you have your working sequelize code, you might end up with something like this:

ModelA.findAll({
  where: { foo: 'bar', 'bing': { $in: [1, 2, 3] },
  include: [{ 
    model: ModelB,
    attributes: ['name', 'created_at']
  }],
  group: ['some_col']
});

Now that you understand sequelize and know how to build complex queries, then you can create your feathers hook. The http request will look like this:

// GET /modelA-service?foo=bar&bing[$in][0]=1&bing[$in][1]=2&bing[$in][2]=3

...and your hook:


function (hook) {
  if(shouldBuildComplexQuery) {
    console.log(hook.params.query); //-> { foo: 'bar', 'bing': { $in: [1, 2, 3] }
    hook.params.sequelize = {
      include: [{ 
        model: ModelB,
        attributes: ['name', 'created_at']
      }],
      group: ['some_col']
    };
  }
}

Notice how hook.params.sequelize is passed as the options param for the underlying sequelize call (no need to set the where property, feathers does this for you). This is true for all feathers service methods: get, find, create, patch, update, remove. Anywhere you see options in the sequelize docs, you can set those options from a hook. This is clearly documented.

In the situation where the ORM doesn't do what you need, write a custom query by hand. This is trivial to do with sequelize. An ORM is just a tool, not an end-all-be-all.

FredericLatour commented 7 years ago

@daffl The problem is not to go into the ORM language/features, the problem is to understand exactly what I can do and at which stage.

Your example is a very specific case that let me wondering what my approach should be for a generic hook that will let me implement any kind of aggregation.

Say I want to issue something like this:

SELECT SUM(amount1) AS total1, SUM(amount2) AS total2
GROUP BY Year(date_purchase)

Is the proper approach to come up with a specific JSON structure that I can parse at hook stage in order to come up with the proper sequelize query? For instance something like that:

$group = {
   attributes: [
     { 'sum', 'amount1', 'total1'},
     { 'sum', 'amount2', 'total2'}
   ],
  groupBy: [
   { 'day', 'date_purchase'}
  ]
}

Is that the proper approach if I want to come up with a generic hook for issuing aggregate queries over sequelize? Am I missing something? Any wall that I will hit with this approach? Will not mix with feathrersjs own parsing ?

I had a few questions sometimes ago that I came up after some more investigation. Unfortunately I never had an answer. It would have certainly helped clarifying things a bit.

That said, if this is indeed the approach, I have a hard time to understand how is that that such a feature is not available either by the way of an external/optional module that would parse a json group definition (based on an arbitrary schema) into a sequelize query or even out of the box by feathersjs (it seems that some adaptaters can support specific features eg: search).

pedrogk commented 7 years ago

Yes @daffl, It was a mistake from my side, that is why I deleted my comment. I figured it out with the pointers you gave here.

Anyway, here are the steps I used to implement a custom aggregation with feathers-sequelize in case it is useful for somebody else in the future:

  1. In order to trigger your custom aggregation, call a findoperation on the service and add a custom parameter to the query. query: { $whatever: 'col1' }. Note that I use $whatever simply to state that you can name it whatever you want, it is just a custom flag you are adding so that you can detect it in a hook and use the value provided to customize your query, you will remove the flag from the final parameters before actually executing the query.
  2. Create a before hook for the given service on the find operation (either manually or using the cli).
  3. In the hook code, check if your custom query param exists, using something like if (hook.params.query.$group).
  4. If it is there, then make your custom query by altering hook.params.sequelize. Note: For aggregate functions you most likely will end up using sequelize.fn calls. Check the sequelize docs for syntax details (also providing a sample below).
  5. Don't forget to remove your custom param/flag, otherwise sequelize won't know what to do with it when it executes the query delete hook.params.query.$whatever;.

Here is the complete code for a hook like this.

const sequelize = require('sequelize');

module.exports = function (options = {}) { 
  return function customAgg(hook) {
    const groupBy = hook.params.query.$whatever;

    /* if our flag is present, customize the query. Otherwise just let it be. */
    if (groupBy) { 
      if (!hook.params.sequelize) {
        hook.params.sequelize = {};
      }

/* The following code would assemble a query like:
SELECT x, COUNT(x), AVG(y) as y, AVG(z) as z FROM [service_table] GROUP by x
In this case, the x column name is parameterized as the groupBy variable we picked up from our custom flag, and the y and z column names are hardcoded, but you could also parameterize them.
*/
      hook.params.sequelize = { group: groupBy };
      hook.params.sequelize.attributes = [ 
        groupBy,
        [sequelize.fn('COUNT', sequelize.col(groupBy)), 'count'],
        [sequelize.fn('AVG', sequelize.col('y')), 'y'],
        [sequelize.fn('AVG', sequelize.col('z')), 'z']
      ];

      delete hook.params.query.$whatever;
    }
// You could add more code here to catch for other flags/scenarios.
  };
};
daffl commented 7 years ago

I think one thing that keeps getting missed for some reason is that you can just as easily create your own services that do their own more complex aggregation. You don't have to rely on the built in adapters, just like you would write your own controllers or handlers in an Express or GraphQL API.

It is not uncommon to create separate Dashboard services that do that kind of data aggregation, in your example something like this:

class ReportingService {
  find(params) {
    // params.query has the query
    const Model = this.app.service('products').Model;

    // Here you can do anything you would do with plain Sequelize

    return Model.findAll({
      attributes: [
        { 'sum', 'amount1', 'total1'},
        { 'sum', 'amount2', 'total2'}
      ],
      groupBy: [
        { 'day', 'date_purchase'}
      ]
    });
  }

  setup(app) {
    this.app = app;
  }
}
pedrogk commented 7 years ago

That does look more convenient. I am still in that newbie state where I am dumbed down by the cli and not yet comfortable doing services from scratch. But I will give it a shot. Thanks.

UPDATE: Oh, figuring the cli also does scaffolding for custom services. What joy. :)

FredericLatour commented 7 years ago

@DesignByOnyx

From reading the conversation above, it seems that you don't entirely understand how sequelize and/or feathers works. That's not a jab at you, it's just an observation from having helped hundreds of people the last 2 years - it's easy to spot those who don't quite understand yet.

For sure but that's somewhat on purpose (well I may be stupid as well :)). I don't have the time to master (even superficially) both sequelize and feathersjs even though I'm interested by the concept. Therefore I need to determine if the product will overall let me do what I want in a faster way. That's obviously a compormise ... but a necessary compromise in an environment with hundreds of new technologies every day.

That said, if you read my message 26 days ago (https://github.com/feathersjs/feathers/issues/677#issuecomment-331676581), it looks like I'm starting to get things sorted out. Just that I kept getting confused by the information provided: for instance why would anyone implementing a group by on a hardcoded field in a hook? How would some parsing on my side would affect feathersjs own parsing... Maybe this is completely obvious for long time experimented users but not necessarily for the others ...

I mean there is certainly a chapter missing in the documentation - something like: how to superset feathersjs query in a generic way

It would certainly help to clarify everything. That's quite easy to understand what feathersjs do. And it seems to do great. However we need to ensure we can implement with a reasonable complexity what it can't do and that we need.

snewell92 commented 7 years ago

@FredericLatour

You mention that you were interested in something more generic to apply to a broad set of requests.

I do something similar for search fields that I have throughout my app, I basically have a hook that takes a property and then looks for a specific query (code is in typescript):

export const filterLikeQueryParam = curry((prop: string, hook: any) => {
  let qryObj = hook.params.query;

  if(!qryObj || qryObj.$like == null || typeof qryObj.$like != 'string') {
    return;
  }

  let searchTerm = qryObj.$like;
  delete qryObj.$like;

  if(searchTerm.length == 0) {
    return;
  }

  let whereObj = {
    [prop]: {
      $like: '%' + searchTerm + '%'
    }
  };

  if(!hook.params.sequelize) {
    hook.params.sequelize = {};
  }

  // TODO check for existing where clauses if/when code composes different LIKE queries
  hook.params.sequelize.where = whereObj;
});

If you wanted a general handleGroupBy hook you could probably write something similar, that would look for a particular query parameter, check it, delete it, then process it into the sequelize object to form the query.

Also, in my example I curry my hook so I can specialize it across different services with a different prop value, however that could be specified in the query as well. I just didn't want to let clients choose in this case. Whatever floats your boat and works.

To answer some of your questions directly - hardcoding groupby/like property is useful for specific services with limited scope. For example my hook above is curried, so I partially evaluate it with the property I want, basically just one step away from 'hard-coding'.

Your question about messing with feathers parsing is interesting... Feathers more or less (From my understanding) just passes all of the query parameters you give it to the data layer (sequelize or whatever else). Thus if you process it, just delete the properties you process, so feathers doesn't process them.

PS - I started using feathers just a few months ago, so I'm still quite fresh like yourself, but the community has been super helpful on Slack and here on github.

Best of luck!

lock[bot] commented 5 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue with a link to this issue for related bugs.