feat(populate): Support multiple 1:N relationships on key

Domiii commented 7 years ago

Currently, for each entry in the populates array one object is grabbed from a single path, which is then used to replace it's "lookup key" in the original object. However, I want to present another common usage scenario - multiple one-to-many relationships on a single (foreign) key:

Scenario example

Data

/submission/$(submissionId)
/feedback/$(feedbackId) where each child has submissionId
/notifications/$(notificationId) where each child has submissionId

There can be multiple feedback and multiple notifications entries for each submission.

Grabbing all `feedback` and `notifications` with each `submission`

From what I understand from the v1.5 updates, there is (will be) a way to provide a function for populates that has key and obj parameters that returns the configuration array at query time. While that gives us the submissionId, I don't think it allows us to create a full-on query?

I am thinking of something along the lines of:

{
  path: 'submissions',
  populates: (submissionId, submission) => [
    {
      root: 'feedback',
      targetKey: 'feedback',   // will store the feedback data at the `submission`'s `feedback` property
      queryParams: [ `orderByChild=submissionId`, `equalTo=${submissionId}`]
    },
    {
      root: 'notifications',
      targetKey: 'notifications',   // will store the feedback data at the `submission`'s `notifications` property
      queryParams: [ `orderByChild=submissionId`, `equalTo=${submissionId}`]
    }
  ]
}

NOTE: In this case, there is no child, and instead we get a queryParams because we want to get all matching children in the given root path.

prescottprue commented 7 years ago

@Domiii Is there a reason you can't store a list of $key: true notifications and feedback items on the submission? Then you would populate those parameters (which would work currently)?

Example Submission

{
   text: 'some text in the submission',
   feedback: {
    $feedBackItem1: true,
    $feedBackItem2: true
  },
  notifications: {
    $notificationId1: true,
    $notificationId2: true
  }
}

Example Collections

{
  feedbackItems: {
    $feedBackItem1: {
      text: 'Some feedback',
      owner: $uid
    }
  },
  notifications: {
    $notificationId1: {
      text: 'Some notification'
    }
  }
}

Then this way you can do populates like so:

const populates = [
  {
     root: 'feedbackItems',  // collection on Firebase
     child: 'feedback' // parameter on submission
   },
   {
     root: 'notifications', // collection on Firebase
     child: 'notifications' // parameter on submission
   }
]
@firebaseConnect([
  {
    path: 'submissions',
    populates
  }
])
@connect(({ firebase }) => ({
  submissions: populatedDataToJS(firebase, 'submissions', populates)
})

This way you aren't querying for a matching parameter on each child, instead you are doing a lookup of each child by key.

Domiii commented 7 years ago

I'd rather simplify and avoid duplication: using your suggestion would require duplicate data, i.e. adding the feedback and notification keys to my submissions...

prescottprue commented 7 years ago

@Domiii That pattern is mentioned here in Firebase's docs, I believe they call it "data fanning".

It is "avoiding duplication" to just store the keys as you are going to display them (i.e. a users notifications or submissions). The whole objects do not have to be stored, just a list of keys.

What you are mentioning would not be as performant due to the queries instead of key/value pairs.

Domiii commented 7 years ago

@prescottprue If you have an index on a given key, filtering on that key is going to be just as performant as querying by path, in any well designed database.

Most databases are implemented as hash tables for direct key lookups or B+ trees for range queries.

An index would add a separate data structure, keyed by the given child property, and whose values would be pointers to the actual data. Both methods should observe similar speeds.

However, I could see that the data might be stored in a hash table while indexes might be organized as B+ trees (because indexes need to support range queries). Performance should be similar though, even though hash tables are O(1) while B+ trees are officially O(log n) (with a very very large logarithmic base though, limiting tree height usually to a single digit number).

One thing is for sure: In huge database, scalable solutions, the biggest issue is data locality. And there, indeed could still be difficulties to get index performance up to speed, so maybe these assumptions are indeed a bit much, but at least, in an ideal case, index look up should never be noticeably outperformed by lookup-by-path.

The Firebase query language is a powerful, yet slick and efficient tool in our toolbox (while it certainly could/should have even more power). We should not shy away from using it in the right circumstances. :)

prescottprue commented 7 years ago

@Domiii I have no way of knowing if that assumption is true, but good thought experiment either way.

Though this does not match the original design of populate (meant to replace keys object with values) it is an interesting use case. Open to pull requests, but not totally sure how much of an internal change it would be.

Domiii commented 7 years ago

Ok, thanks for an open mind!

I'll see what I can do with the limited coding time I have these days hah!

BTW: Also I revisited my assumptions on how firebase is implementing things from yesterday. Also, according to this Quora entry from 2014, firebase uses (used) MongoDB for storage. Once I have some more time, I'm sure to investigate further!

mike623 commented 7 years ago

I am similar question for example on my /firendList

{
    $userId1: {
      $firendId1: true,
      $firendId2: true,
    }
}

and all user stored on /users

how can I populate all subkey of $userId1 only ?

compojoom commented 7 years ago

@prescottprue - the default populate works just fine when we use it as you've described. But where it fails is when you have private data. If your rules are similar to this:

{
  submissions: {
    uid: {
      read: true
    }
 }
}

One won't be able to match anything, because trying to load the submissions would cause an unauthorized error. In this case the populate should load the items by key and not as a whole collection. In addition to this if you are trying to load a child of the submisison object we are still downloading the whole submission object. If you use the submisison object to be collection of items related to this submission than you run in performance issues since you would be downloading everything from the firebase server.

ioannistsanaktsidis commented 6 years ago

@compojoom did you find any solution on how to get back only the populated object of the whole object ? I have a similar problem where I have this kind of data:

and I am populating according to ID, yet I get back in return the whole object. Here is my code:

const enhnace = compose(
    firebaseConnect(props => (
        [
            {
                path: 'events_categories',  populates: [{child: `${props.navigation.state.params.id}`, root: 'events' }]
            }
        ]
    )),
    connect(
        ({firebase}, props) => ({
            result: populate(
                firebase,
                'events_categories',
                [{child: `${props.navigation.state.params.id}`, root: 'events' }]
            ),
        })
    )
)

and this is what I get back if I populate by the 3d ID lets say:

I would like only the one that was populated to be returned. Anyone has any similar issue? Any help is much appreciated.

cc @prescottprue .

Thanks in advance!

prescottprue / react-redux-firebase