anvilresearch / connect

A modern authorization server built to authenticate your users and protect your APIs
http://anvil.io
MIT License

Filter returned users from anvil.users.list() #323

Open saikojosh opened 8 years ago

saikojosh commented 8 years ago

We need the ability to filter anvil.users.list() by the document properties. At the moment there's no way to specify options when querying the User collection: /routes/rest/v1/users.js:35.

Specifically we want to be able to:

This will help us reduce overhead because we won't need to query the entire user collection just to pull out a handful of documents.

hedleysmith commented 8 years ago

To expand on Josh's comments:

anvil.users.list() is a function in connect-nodejs. To allow options to be passed in to the REST API we would need to 1) define a consistent way of passing options, 2) define which options are allowed, and 3) figure out how best to do the filtering.

1) For filtering results, URL query parameters should do the job well

2) My thoughts on filtering options, which for the moment I'll only comment on for the route GET /v1/users:

3) Passing query parameters directly through as the options in User.list() could be a quick way to specify the number of documents. Would it also work for specifying a range?
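As a rough illustration of point 1, here is a minimal sketch of how a client could turn an options object into URL query parameters for GET /v1/users. The helper name `buildUsersUrl`, the example host, and the option names `size` and `select` are assumptions made for illustration; this is not how connect-nodejs currently builds its requests.

```javascript
// Hypothetical helper: serialise an options object into a query string
// for the GET /v1/users route. Option names are illustrative only.
function buildUsersUrl (issuer, options = {}) {
  const url = new URL('/v1/users', issuer)

  for (const [name, value] of Object.entries(options)) {
    // Skip options that were not set at all
    if (value === undefined || value === null) continue

    // Arrays (e.g. a list of fields) are sent as one comma-separated value
    const serialised = Array.isArray(value) ? value.join(',') : String(value)
    url.searchParams.set(name, serialised)
  }

  return url.toString()
}

// Example usage with a made-up issuer URL
console.log(buildUsersUrl('https://connect.example.com', {
  size: 10,
  select: ['email', 'name']
}))
```

Sending arrays as a single comma-separated value keeps the URL short, but it does mean the server has to split the value again before passing it on.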

I've got code working to specify the number of documents here:

https://github.com/hedleysmith/connect/commits/user-endpoint-options
https://github.com/hedleysmith/connect-nodejs/tree/user-endpoint-options

Could add PRs if this seems like a sensible way to go, or could figure out more on the PRs first?

christiansmith commented 8 years ago

Let me give a little background on this subject while we're thinking it through.

The idea when we started building Anvil Connect was that user data would be limited in scope to identity-related attributes, leaving domain-specific profile data and ad hoc querying to a separate microservice. There are a number of reasons this seemed like the way to go at the time.

Given that constraint, there would be no need for the general queryability offered by SQL or Mongo-style backends. We wanted to keep this light, fast, close to the metal. We expected that if we needed more sophisticated indexing we'd use something like ElasticSearch. The indexing currently done in Redis is limited to the simple lookups needed by program logic, and unfortunately there isn't a great way to scan across the records in a map-reduce kind of way.

After getting some real-world experience and feedback from other users, we're changing our thinking on this. For many Anvil Connect users, there are domain-specific user attributes that could end up playing a role in access control (apologies for the pun, I couldn't resist), and with a very large number of user accounts, "search results" can be useful. If we're going to allow for extensible user schemas, it only makes sense to have more flexible querying.

These things are being taken into careful consideration for the next generation of Anvil Connect. In the meantime...

To be pedantic, it's not quite possible to filter on a range of user IDs: being UUIDs, they are not sequential, and at any rate I'm not sure Redis hashes are ordered by field. Requesting multiple users is feasible if you already have all the IDs. IIRC, User.get() can already take an array of IDs as an argument. Under the hood that method uses the Redis HMGET command.

Selecting specific fields only is certainly possible as well. There's a select option that can be passed to User.list() or User.get().
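To make the effect of field selection concrete, here is a small standalone sketch of what returning "specific fields only" means for a single user record. It is not Modinha's implementation of the select option, and the function name `selectFields` and the sample record are made up for illustration.

```javascript
// Illustration only: keep just the requested fields of a record.
// This mimics the *effect* of a select option, not its real implementation.
function selectFields (record, fields) {
  const result = {}

  for (const field of fields) {
    // Copy a field only if the record actually has it
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      result[field] = record[field]
    }
  }

  return result
}

// Made-up user record
const user = { _id: 'abc', email: 'jane@example.com', hash: 'secret-hash' }
console.log(selectFields(user, ['_id', 'email']))
```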

The only thing stopping us from a "multiget" of users, attribute selection, paging, and controlling the size of the result set via the API is mapping request params to options in the User method calls.

@hedleysmith It's imperative that we restrict which params can be used for options for security purposes. There are a few options that are really intended for internal use only.
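One way to approach that restriction is an explicit allow-list that maps request params to options and silently drops everything else. The sketch below assumes hypothetical option names `page` and `size` (only `select` is mentioned above), and `queryToOptions` is an illustrative helper, not existing Anvil Connect code.

```javascript
// Hypothetical allow-list of params that may be passed through as options.
// Each entry maps a param name to a parser that returns undefined
// when the value is not acceptable.
const ALLOWED_LIST_OPTIONS = {
  page: toPositiveInt,
  size: toPositiveInt,
  select: toFieldList
}

// Parse a positive integer, rejecting zero, negatives and non-numbers
function toPositiveInt (value) {
  const n = Number.parseInt(value, 10)
  return Number.isInteger(n) && n > 0 ? n : undefined
}

// Parse a comma-separated list of field names into an array
function toFieldList (value) {
  if (typeof value !== 'string') return undefined
  const fields = value.split(',').map(f => f.trim()).filter(Boolean)
  return fields.length > 0 ? fields : undefined
}

// Build an options object from request query params.
// Params that are not on the allow-list never reach the model call.
function queryToOptions (query) {
  const options = {}

  for (const [name, parse] of Object.entries(ALLOWED_LIST_OPTIONS)) {
    if (!Object.prototype.hasOwnProperty.call(query, name)) continue
    const parsed = parse(query[name])
    if (parsed !== undefined) options[name] = parsed
  }

  return options
}

// Example: 'index' is not on the allow-list, so it is dropped
console.log(queryToOptions({ page: '2', size: '25', select: 'email, name', index: 'x' }))
```

Iterating over the allow-list rather than over the incoming params means an unexpected param can never be copied into the options object by accident.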

Glad to pair on this with anyone that wants to put some effort into fleshing out this part of the API, and look forward to reviewing PRs. Thanks in advance.

hedleysmith commented 8 years ago

Hey Christian,

Thanks for all the background info on this, very useful to understand. Also good to hear about the plans for the next iteration of Anvil Connect relating to these ideas.

Seems like this is all technically possible and that the underlying functionality in Redis / Modinha will already support what we want to achieve.

Yes, let's set up a time to talk this through. I'll ping you on Gitter to figure out when would be a good time.

Just to recap, mainly for my own benefit, I think the main requirements and questions are:

PetrSnobelt commented 8 years ago

:+1: