Support more complex populates

ekryski commented 8 years ago

This hook (https://www.npmjs.com/package/feathers-populate-hook) has come up in the community and does some great things but also I think overcomplicates things a bit.

A few things that we've seen a need for through experience or what people have mentioned:

The ability to specify a location on the hook object to put the populated object. (ie. params, data, result). We frequently will populate an entity into params so that fields can be referenced for later but they are not the entity that is actively being modified. An example, would be:

After a payment we need to email a receipt. So we need to have the products bought, the user to send to, and the company they belong to in order to get the shipping address. The order is the entity being modified by a create but we need to populate both the user and the business to hook.params for reference when setting up the email payload.
Being able to select fields from a populated entity. (this could just be a second hook that plucks those attributes, and likely should be)
Being able to populate multiple entities (ie. user and business). I think this can be resolved by simply registering the populate hook twice.
Being able to populate multiple of the same entity (ie. all comments on a post)
Being able to allow the client to specify what to populate and which fields via $populate. This would be very useful but starts to tread into GraphQL territory and might become unnecessarily complex or an expensive operation.

Now this may be more of a meta issue to discuss the possibility and viability of supporting these requirements. Not all of the requirements are valid in my mind or can be easily solved as I had mentioned above.

Proposed Interface

commonHooks.populate({
    user: { // destination key
        service: 'users', // can be a string or an actual service object
        field: 'adminId', // field to use for query, only required if different from destination key
        on: 'result' // where to put on the hook object. could be 'data', 'params', or 'result'
        multiple: true // defaults to false and returns one,
        query: { // defaults to {service.id: this.field} but can be any standard Feathers query object
            $select: ['name', 'email']
        }
    }
})

eddyystop commented 8 years ago

We should consider if we want to support arrays of unknown size. Let's say we want to join the Parts item to the componentParts below joining on partId.

finishedPart: {
  componentParts: [
    partId: 'aa'
  ]
}

Of course an array element may itself contain an array.

I've already been asked a similar question on #help.

ekryski commented 8 years ago

@eddyystop I'm not sure I follow that example. Are you saying something like:

blog: {
  posts: [
    {
      title: 'post 1',
      comments: [
        { text: 'comment 1 on post 1'},
        { text: 'comment 2 on post 1'}
      ]
    },
    {
      title: 'post 2',
      comments: [
        { text: 'comment 1 on post 2'}
        { text: 'comment 2 on post 2'}
      ]
    }
  ]
}

where if the developer wanted to populate all the comments for every post?

eddyystop commented 8 years ago

How about this

db = {
    blogs: [
      { authorID: 'eddyystop', postIDs: [ 111, 222 ] },
      { ... }
    ],
    posts: [
      { postID: 111, title: 'title 1', body: [], commentIDs: [ 11101, 11102 ] },
      { postID: 222, title: 'title 2', body: [] },
    ],
    comments: [
      { commentID: 11101, body: [] },
      { commentID: 11102, body: [] },
    ]
  }

blogs.find({ query: { authorID: 'eddyystop' }}

and now you want to join all the posts using postIDs, then join all the comments to their posts, so you end up with

result = {
    authorID: 'eddyystop', postIDs: [ 111, 222, 333],
    posts: [
      { postID: 111, title: 'title 1', body: [], commentIDs: [ 11101, 11102 ], comments:
          [
            { commentID: 11101, body: [] },
            { commentID: 11102, body: [] },
          ]
      },
      { postID: 222, title: 'title 1', body: [], }
    ]
  }

The point is there are a variable number of postIDs and a variable number of comments per post.

So we have to point to an array, not a single value, and say populate these. Its easy to point to postIDs but harder to point to commentIDs for each element in posts.

So we might end up with a notation like: populate postIDs[] and later populate postIDs[].commentIDs[] where we explicitly indicate which are array. Or postIDs.commentsIDs where we dynamically see they are arrays. Or we have nested join instructions where the nesting provides the same info as above.

My sense is that the more explicit we are about is an array the easier and less error prone it is.

eddyystop commented 8 years ago

A separate consideration is that a base item may be populated differently for different needs.

Consider a Purchase Order. Accounting is likely to want to populate the PO with the multiple invoices paying the PO. Receiving however would want to populate the PO with the multiple receiving documents generated as the PO was being fulfilled. There are many such different views in a line-of-business system.

Different populates can be done using commonHooks.iff(condition, populateHook) though we'd likely want an enhancement to support commonHooks.iff(condition, populateHook1, populateHook2, ...).

Let's call each potential way of populating a View (to misuse a RDBMS term). It may be cleaner to define the different Views in an View object:

// in purchaseOrders service
const views: {
  accting: [ hooks to populate PO the way accounting likes ],
  rec: [ hooks to populate PO the way receiving likes ]
};

We can then have a hook like commonHooks.populate(views.accting) and of course the param can be dynamic.

Better still, perhaps we can find some way for the the service method call to specify the view it wants e.g. purchaseOrders.get(id, { view: 'accting' }). Right now I believe that 2nd param is ignored on the client side.

p-diogo commented 8 years ago

I'm liking where this is going. A stronger populate() is definitely needed (came into this issue when noticing the lack of dot notation and arrays support).

So we might end up with a notation like: populate postIDs[] and later populate postIDs[].commentIDs[] where we explicitly indicate which are array. Or postIDs.commentsIDs where we dynamically see they are arrays. Or we have nested join instructions where the nesting provides the same info as above.

My sense is that the more explicit we are about is an array the easier and less error prone it is.

Not sure if it makes sense, but how about adopting the $ placeholder for arrays? MongoDB is using it , so I'm assuming most developers will be familiar with it...

eddyystop commented 8 years ago

@PedroMD what would postIDs[].commentIDs[] be in $ notation?

beeplin commented 8 years ago

It's such a coincidence that I am just writing my populateToData hook which populates to hook.data rather than hook.result and as well supports selecting fields to populate . And I am thinking of writing hooks like populateArray, populateArrayToData. Would be wonderful if all these are included into one essential populate hook.

beeplin commented 8 years ago

I agree with @ekryski 's initial post except for one point:

Being able to select fields from a populated entity. (this could just be a second hook that plucks those attributes, and likely should be)

I don't think it looks neat if we populate a record with something unnecessary and then use another pluck hook to remove them. It's much better to just provide a query option and to use $select to do this.

In other words, a populate hook means two things: 1) to execute a find operation on a service; 2) to add the result data of the find operation to some place in the current record. Therefore, a populate hook must have full features find has, that is, full query syntax support with $select, $limit, $sort, $skip, $gt, etc.

eddyystop commented 8 years ago

I've been thinking about reducing repetition when you have multiple populates. There can be multiple populates on one service, depending on what "View" is needed. You also have multiple populates when you join the same tables on more than one service.

We may allow a central definition of how tables link to one another. Most simplistic might be something roughly like this. Relation is for informative purposes only and optional.

const joins = {
  'blogs=>posts': { find: (...) => { ... }, cardinality: '1:m', relation: 'contains' /* blog contains posts */ },
  'posts=>blogs': { find: (...) => { ... }, cardinality: '1:1', relation: 'belong to' /* posts belong to blog */},
  'posts=>comments': { find: (...) => { ... }, cardinality: '1:m', relation: 'has' /* post has comments */},
  'comments=>posts': { find: (...) => { ... }, cardinality: '1:1', relation: 'belong to' /* comments belong to post */},
};

There would be no difference in the populate hook specifying joins['blogs=>posts'] or { $select: (...) => { ... }, cardinality: '1:m }. The point however is that with the former we do not need to repeat the definition every time.

This can be generalized. We can have a central definition of Views, a view being a join of at least 2 tables (using the joins idea above). We could have a populate hook where the join is views.postsAndComments or it can be the hardcoded details. We may select fields at this point.

The point of this post is that if we design the API properly we don't have to keep repeating the details for every populate hook. We just refer to where the details are. It comes down to the API being an object where some of the properties are themselves objects. So, nothing strange.

eddyystop commented 8 years ago

@beeplin You may want to see if the new feathers-service-logger-stats is of use to you.

beeplin commented 8 years ago

@eddyystop thanks! this logger is quite usable. ;)

kaiquewdev commented 7 years ago

The module aims to be a tool to populate the application in a usual way?

kaiquewdev commented 7 years ago

Whats is the main purpose for this module?

ekryski commented 7 years ago

@kaiquewdev yes. To be able to easily populate deeply related entities across the DB adapters. With the current hook you can only go 1 level deep. @eddyystop has been working on a prototype version in another repo. https://github.com/eddyystop/feathers-test-populate-etc

eddyystop commented 7 years ago

We're quite advanced with this in https://github.com/eddyystop/feathers-test-populate-etc

thelonglqd commented 7 years ago

Does any way to sort with populated field in feathers 2.0.3 ?

eddyystop commented 7 years ago

Could you please provide an example? Thanks.

On Thu, Mar 30, 2017 at 5:25 AM, Long Nguyen The Phung < notifications@github.com> wrote:

Does any way to sort with populated field in feathers 2.0.3 ?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/feathersjs/feathers-hooks-common/issues/23#issuecomment-290353883, or mute the thread https://github.com/notifications/unsubscribe-auth/ABeznzD9YrvhcJTMlnLjjpEYGUdnyifOks5rq3UXgaJpZM4KhXz6 .

thelonglqd commented 7 years ago

@eddyystop: My case is like this: A: {}, B {name: "abc"}, B is populated field (has relation with A), and I need to sort with name in B.

eddyystop commented 7 years ago

Would you mind opening a new issue for this so we can keep track of it, instead of it being lost within here? I'll be addressing hooks after the Auk release lands.

feathersjs-ecosystem / feathers-hooks-common

Support more complex populates #23

Proposed Interface