SkygearIO / features

Feature Tracking Repo for Skygear
Apache License 2.0
3 stars 12 forks source link

Re-design User Profile Discoverability #48

Closed chpapa closed 7 years ago

chpapa commented 7 years ago

Description

Currently there are multiple APIs like discoverUserByEmails, getUsersByEmail, queryUsersByEmail etc. It is very confusing. We should consider design a new set by considering the scenario of usages.

API Design

Scenario

Sample Code

Put sample code of how you vision this API will be used, consider different type of developers and different abstract level

API Design

PR:

Technical initial thought

Existing related API

Related Issues

Progress Tracker

Preparation

Implementation

Documentation

Release

Advice

tensiuyan commented 7 years ago

A use cases suggested by a user.

  1. User sign up with ‘username’, ‘nick name’ and ‘email’.
  2. able to query users list in SDK to display ‘nick name’ and ‘email’ as contact.
  3. able to change their ‘username’ , ‘nickname’ and ‘email’.
chpapa commented 7 years ago

Just jot down a thought when I discussed it with @cheungpat casually:

  1. Leave only 1 User Object, which is used for both auth and private user profile.
  2. Some fields like password are hidden (_password? Or shall User Object just own _user somehow?)
  3. Since we can't change the ACL system in short time, the User Object is a PrivateUserObject:
    • Private attributes
    • Queryable but not listable attributes.
  4. Since "Queryable but not listable" are not supported in current ACL, they are actually private attributes, but a special function (e.g. discoverUserByAttribute) can be used to query the user object.
  5. The concept of UserProfile become a PublicUserProfile, the API can support easy create of PublicUserProfile

In a long run, it is better to just have an ACL system that support column-based permission, and also support not just READ / WRITE but also QUERY actions.

Modification of existing API:

chpapa commented 7 years ago

Another idea is just to make User object very very special, a Public Record Object, but with special magic that make private attributes hidden from other users. The advantage is it is very simple and easy to understand, but a crazy exception in the system.

ben181231 commented 7 years ago

Scenario

API Design

After considering the scenario of usages, the following updates of APIs are proposed:

  1. Create a new Record Discovery API.
  2. Change current user record to profile record saving in public database.
  3. Chagne current _user object to new user record saving in private database.

The following are some facts about Record Discovery API, Profile Records and User Records:

Record Discovery API

Profiles Records

User Records

Sample Code


// User Case: Add custom fields to user record
skygear.signup({
  username: 'user01',
  password: 'some-password'
}).then((resp) => {
  const userRecord = resp.user;
  userRecord.set('sex', 'male');
  return skygear.privateDB.save(userRecord);
});

// User Case: User profile update
skygear.login({
  username: 'user01',
  password: 'some-password'
}).then((resp) => {
  const profile = resp.profile;
  profile.set('status', 'online');
  return skygear.publicDB.save(profile);
});

// User Case: User profile query
const profileQuery = new skygear.Query(skygear.Profile);
profileQuery.equalTo('status', 'online');
skygear.publicDB.query(profileQuery).then((profiles) => {
  console.log(`Online users: ${profiles}`);
});

// User Case: User record discovery
const userQuery = new skygear.Query(skygear.User);
userQuery.discover('sex', 'female');
userQuery.discover('location', 'HK');
skygear.privateDB.query(userQuery).then((users) => {
  console.log(`Hong Kong female users: ${users}`);
});
ben181231 commented 7 years ago

@chpapa @rickmak @cheungpat @b123400 Please give some comments on the proposal above.

chpapa commented 7 years ago

@ben181231 I think your proposal are pretty interesting. Especially I like user records works just like other DB records with DB hooks!

Some questions/thoughts:

  1. I don't think Record Discover API should work in this way, people don't expect it. For example, if I set private records with a "District" attribute. That doesn't mean I expect my apps can make all users in Tuen Mun district public. But in your design, userQuery.discover('district', 'Tuen Mun') is an exact match and will return ALL users who live in Tuen Mun.
    1. So maybe we should still make it explicit to set which attribute are queryable?
  2. How to handle the use cases of building a contact list for wechat-liked apps? It is likely in those apps we have to get users via public profile. For example, I might have an attribute age in Profile and search all users ages > 25. I will than have to store the list of Users as Profile records. However I would imagine the Chat plugins SDK will only take User as an parameter, not Profile; What should I do in that case?

So I guess there are two technical details need to be considered:

  1. Any other way we can avoid explicitly set which attributes are queryable? We can make it only applicable to User Record but not general I think.

  2. We need to consider User Record behave differently, when it is on the hand of logged in users and other users. Otherwise our API will need to accept both Profile Records (Public) and User Records (Private)?

ben181231 commented 7 years ago

What I think is to make the User Record as normal as possible. Otherwise, it would become the current User Object.

So maybe we should still make it explicit to set which attribute are queryable?

I think the rule that all fields beginning with _ cannot be queryable is enough for developers to design which parts of data is queryable. If we make it explicit, we would need to store the state and all SDKs may need to synchronize the state when it is changed.

chpapa commented 7 years ago

I think the rule that all fields beginning with _ cannot be queryable is enough for developers to design which parts of data is queryable. If we make it explicit, we would need to store the state and all SDKs may need to synchronize the state when it is changed.

I think the problem of using _ is it was already used as a way to mark columns/attributes maintained by Skygear, and users of Skygear are not supposed to touch them. Now if we tell developers they should use attributes start with _ to make a "really private" attributes, that is very confusing.

Any thoughts on my other comments/use cases suggested?

ben181231 commented 7 years ago

How to handle the use cases of building a contact list for wechat-liked apps? It is likely in those apps we have to get users via public profile. For example, I might have an attribute age in Profile and search all users ages > 25. I will than have to store the list of Users as Profile records. However I would imagine the Chat plugins SDK will only take User as an parameter, not Profile; What should I do in that case?

This problem comes from how we link User Record and Profile Record. In my thought, a user record and a profile record share the same record id if they are representing the same user. But I have no opinion whether we should provide some helper methods for this case.

chpapa commented 7 years ago

Thanks. Any other opinion? Maybe @ben181231 you could try to revise the proposal by illustrating the sample code of the use case of contact list, and also see if you have any idea for addressing overloading of the usage of fields beginning with _ / make discoverable fields explicit?

cheungpat commented 7 years ago

for @ben181231: To help me understand your proposal, I need these information. If you are not that thinking about implementation details, see below.

Questions:

Chagne [sic] current _user object to new user record saving in private database.

  1. Is this “user record” saved to the _user table or saved to other table?
  2. How will the _user table schema changes?
  3. If the “user record” has field currently in the _user table, will the client SDKs be able to fetch/query/set username, password etc?
  4. Will _user table add columns like _created_at, _owner etc?

Record Discovery API cannot match on reserved fields (fields begining with _).

Not sure what you mean by this... how will the table schema looks like after the proposed changes?

User Record can have reserved fields (fields begining with _) which are NOT queriable by other users.

In the current, Record DB, fields beginning with _ can be queried by all users (such as _created_at). Are you suggesting User Record to have different behavior for this?

Comment on Sample Code:

skygear.signup({
  username: 'user01',
  password: 'some-password'
}).then((resp) => {
  const profile = resp.profile // cheungpat: does this exists?
  profile.set('status', 'online');
  skygear.privateDB.save(profile) // cheungpat: what will happen if saving profile to private DB?

  const userRecord = resp.user;
  console.log("Password hash is: ", userRecord.password); // cheungpat: what will happen?
  userRecord.set('username', 'cheungpat'); // cheungpat: is this allowed?
  userRecord.set('sex', 'unknown'); // is this saved in `_user`.`_sex` or `_user`.`sex` in database table?
  skygear.publicDB.save(userRecord); // cheungpat: what will happen if saving user record in public DB?

  console.log("User Record ID: ", userRecord.id, "\nProfile Record ID:", profileRecord.id) // cheungpat: do they match?
});

const userQuery = new skygear.Query(skygear.User);
userQuery.discover('password', '$2a$10%...'); // is this allowed?
skygear.privateDB.query(userQuery).then((users) => {
  console.log(`users: ${users}`);
});

If you are not yet thinking about implementation, I think you can go even further (dream bigger) what you think the developer can do. Throw away the current legacy and we can see how the implementation can be changed so that we can achieve that.

ben181231 commented 7 years ago

By answering the following questions, I hope the proposal would be clarified.

How is the linkage of user record and profile record?

In developers' view, user records and profile records are 2 types of records and they are saving in private DB and public DB respectively. For developers who want to get user record from profile record, or get profile record from user record, we may provide some helper methods for them. The following snippet shows how the helper methods would be used:


// User Case: Query users using profiles
const profileQuery = new skygear.Query(skygear.Profile);
profileQuery.equalTo('status', 'online');
skygear.publicDB.query(profileQuery).then((profiles) => {
  console.log(`Online users: ${profiles}`);
  return skygear.usersOf(profiles);
}).then((users) => {
  const usernames = users.maps((eachUser) => eachUser.username);
  console.log(`Usernames of online users: ${usernames}`);
});

// User Case: Query profiles using users
const userQuery = new skygear.Query(skygear.User);
userQuery.discover('sex', 'female');
userQuery.discover('location', 'HK');
skygear.privateDB.query(userQuery).then((users) => {
  console.log(`Hong Kong female users: ${users}`);
  return skygear.profilesOf(users);
}).then((profiles) => {
  const onlineProfile
    = profiles.filter((eachProfile) => eachProfile.status === 'online');
  console.log(`Online Hong Kong female users count: ${onlineProfile.length}`);
});

The snippet above shows how developers query users using profiles and query profiles using users. Both information can be retrieved using 2 requests. It can be reduced to 1 request after we implemented the Inner Query feature.

Which records are queryable / discoverable?

In the proposal, how the record is queryable depends on which DB (public or private) it is saved and which field it is being queried:

The proposal tries to overload the _ prefix of field names of records on private DB. With this, developers have no need to configure which fields are queryable explicitly.

How is the database migration on current schema?

After thinking a little bit about the implementation, the following database schema changes are proposed:

  1. Rename _user table to _auth.
  2. Rename user table to profile.
  3. Create a new table user and add all default fields of a record table.
  4. Move _auth.username and _auth.username to the new user table.

After the changes, the _auth, user and profile table would be like:

_auth table (originally _user table):

Name Type Default Constrains
id text Primary Key
password text
auth jsonb
token_valid_since timestamp
last_login_at timestamp
last_seen_at timestamp

user table (newly created):

Name Type Default Constrains
_id text Primary Key
_database text
_owner_id text
_access jsonb
_created_at timestamp Not Null
_created_by text
_updated_at timestamp Not Null
_updated_by text
username citext Unique
email citext Unique

profile table (originally user table):

Name Type Default Constrains
_id text Primary Key
_database text
_owner_id text
_access jsonb
_created_at timestamp Not Null
_created_by text
_updated_at timestamp Not Null
_updated_by text

The reason for the changes:

ben181231 commented 7 years ago

For how to declare which fields of records on private DB, I am open to make it explicit.

After think roughly, the declaration would be something like the default ACL. Developers are expected to declare it on the SDK bootstrap stage and it would send to server on development mode (would be no-op for production mode).

In database, there would be a table recording whether every field of every record type can be queryable.

After that, it is like building a base for field-based ACL. Should we make it in this feature?

cheungpat commented 7 years ago

@ben181231 Thanks you for your table schema it all made sense to me now. I am open to breaking changes but we need to study more how much this will impact developer.

Clarifications

Move _auth.username and _auth.username to the new user table.

I suppose you meant _auth.username and _auth.email.

Also,

On _-prefixed fields

According to the current record:save action, the _-prefixed fields are blocked from modifications. That means the developer is not able to save to _sex through the record:save action (not unless we change the semantics and implementation of _ substantially).

By the way, the _-prefixed fields are queryable (e.g. filtering by _created_at)

On field ACL

I think having field ACL is an option we should consider because we don’t need to overload _-prefixed field for discoverability. In addition to discoverability, we may not need the profile record at all.

We can keep user record which contains public DB records of all users. The user record will have username and email columns migrated from the old _user table. The username and email column can be set to non-queryable, hidden and discoverable.

username and email on user table

When the username and email are on user table, they are subject to modification by the schema:* actions. We need to solve this problem. One of the following,

chpapa commented 7 years ago

@rickmak @cheungpat @ben181231 @b123400

To follow up with an offline discussion, it was concluded that a single user record in publicDB is most favorable and easy to use, with a Record-type field based ACL.

As discussed, I will try to generalize the ACL and propose the default ACL for user object, and discuss use cases / special exception.

Use Cases on User Object

For a single user object, we will need to use it for:

  1. Auth operation - sign up, log in, reset password, etc
  2. As a reference object from other objects - e.g. build a contact list
  3. Store user profile attributes private to the owner.
  4. Store user profile attributes public to everyone/someone.
  5. Allow everyone/someone to fetch your user object by an exact match of attributes (email address, URL slug, etc)
  6. Allow everyone/someone to query your user object by a query filter (all male users live in Hong Kong)

Generalized ACL to support

I would like to expand on what @cheungpat have proposed and a more generalized ACL by Record Fields.

API

setRecordFieldsACL(RecordType, UserRole, [Fields], Readable, Queryable, Discoverable, Updatable)

On UserRole:

On Access:

Optioanl

Updatable and Pointer Role are good to have but not necessary to fulfill all use cases:

Default

setRecordFieldsACL(*, EveryoneElse, *, True, True, True, False)
setRecordFieldsACL(*, Owner, *, True, True, True, True)
setRecordFieldsACL(User, Owner, [password, auth], True, False, False, False)
setRecordFieldsACL(User, EveryoneElse, [password, auth],  False, False, False, False)
setRecordFieldsACL(User, EveryoneElse, [username, email],  True, False, True, False)

Questions:

Sample Cases

Default

Class UserRole Field Readable Queryable Discoverable Updatable
* EveryoneElse * Y Y Y N
* Owner * Y Y Y Y
User EveryoneElse password N N N N
User EveryoneElse auth N N N N
User Owner password Y N N N
User Owner auth Y N N N

Make gender field private (not readable by others except owner)

setRecordFieldsACL(User, EveryoneElse, [gender], N, N, N, N)

Class UserRole Field Readable Queryable Discoverable Updatable
User EveryoneElse * Y Y Y N
User Owner * Y Y Y Y
User EveryoneElse password N N N N
User EveryoneElse auth N N N N
User Owner password Y N N N
User Owner auth Y N N N
User EveryoneElse gender N N N N

Make gender fields only readable by owner and friends

# Friends is a field with an array of User Reference
setRecordFieldsACL(User, ID_Friends, [gender], Y, N, N, N)
setRecordFieldsACL(User, EveryoneElse, [gender], N, N, N, N)
Class UserRole Field Readable Queryable Discoverable Updatable
User EveryoneElse * Y Y Y N
User Owner * Y Y Y Y
User EveryoneElse password N N N N
User EveryoneElse auth N N N N
User Owner password Y N N N
User Owner auth Y N N N
User EveryoneElse gender N N N N
User ID_Friends gender Y N N N

Make gender fields readable by everyone except Ben

setRecordFieldsACL(User, EveryoneElse, [gender], Y, N, N, N)
setRecordFieldsACL(User, Ben, [gender], N, N, N, N)

Make a URL slug field

setRecordFieldsACL(User, EveryoneElse, [slug], Y, N, Y, N)
ben181231 commented 7 years ago

Some comments about the field-based ACL:

  1. The field accessibility should be in a level instead of a option. For example, if someone is updatable to a field of a record type, he should be queryable to that field. This also aligns the current implementation of record-based ACL. So, there should be 5 levels of accessibility of record fields, from high to low listed as followed:

    1. Read Write (rephrased from "Updatable" aligning with record-based ACL)
    2. Queryable
    3. Discoverable
    4. Read Only (rephrased from "Readable" aligning with record-based ACL)
    5. No Access
  2. Anonymous User should be rephrased as Any User, if it means any logged in users.

  3. I think the usage of Everyone Else would be quite rare. By the suggested resolving order, the rules applying to Any User would overwrite the remaining role types like Owner and user defined roles. Applying rules to Any User should be the same as applying to Everyone Else, except that Everyone Else has higher access level than the role types above it, like a field of a record is queryable by everyone but not it's owner. That is a rare use case. So, should we support that?

  4. After considering the implementation, I think the variable role types (like Owner and Pointer Role) are difficult to achieve since we need to look ahead the data of the record to determine the accessibility of a specific field. This would be more complicated if we support pagination. Comparing Owner Role and Pointer Role, the former one would be easier since the format is fixed (it is a string) and we may generate jsonb operator for it.

chpapa commented 7 years ago

@ben181231

  1. Nice idea for 5 levels of accessibility. While the current model seems perfectly fit with it, I have at least one case: Not readable, but queriable (N,Y,N,N) -- Search for users between ages 20 - 35, but can’t disclose the exact ages of each users; I'm not sure if there will be cases of Not readable, but updatable (N,Y,N,Y)
  2. Anonymous User should mean any logged in users without username/email. Or signupAnonymously I think. Those should be for the case of, for example, if someone use Skygear to build a social network which only signed up users can see photos. (As practically Skygear users will always signupAnonymously to use DB Feature)
  3. I don't understand this point. However in the model, actually EveryoneElse is essential for most use cases. If you suggest to take it away can you suggest an alternative which able to model the use cases? (Or I'm guessing, do you just meant we can rename EveryoneElse to AnyUser but keep the meaning of it to all users that weren't specified by others rules?)
  4. I can imagine pointer type ACL is difficult to implement, could we looks into details of how to implement that and the performance implication before deciding we gave it up? Coz the use cases it might model sounds quite interesting.
ben181231 commented 7 years ago

Search for users between ages 20 - 35, but can’t disclose the exact ages of each users

Oh... Can we consider this as an edge case since this does not fit the current implementation of record-based ACL (i.e. 3 levels of accessibility - No Access, Read Only and Read Write) ?

if someone use Skygear to build a social network which only signed up users can see photos.

Skygear would treat anonymous user as a normal user. I don't think it is good to make anonymous user as a special case in our ACL system. For the use case, the social network still need some kinds of sign up process using email or username.

ben181231 commented 7 years ago

After the offline discussion yesterday, I have we have different views on how the ACL system handle different rules with overlapping scope, including those applying to specific field and to * field, including field-based rules and record-based rules. That is the cause of our different understanding of the EveryoneElse role. Could you clarify a little bit more?

@chpapa

chpapa commented 7 years ago

@ben181231

Oh... Can we consider this as an edge case since this does not fit the current implementation of record-based ACL (i.e. 3 levels of accessibility - No Access, Read Only and Read Write) ?

Why do we need to consider this an edge case? I don't see why it "does not fit" the current implementation of record-based ACL. Technically speaking every field-based ACL does not fit record-based ACL if you meant it doesn't fit the 3 levels of accessibility anyway.

Skygear would treat anonymous user as a normal user. I don't think it is good to make anonymous user as a special case in our ACL system. For the use case, the social network still need some kinds of sign up process using email or username.

Why is it not good? And how would you propose to solve the use cases mentioned above (build a social network which only signed up users can see photos) if we don't make anonymous user a special class?

In fact, we already treat anonymous user a special class, we have an API called signedUpAnonymously

After the offline discussion yesterday, I have we have different views on how the ACL system handle different rules with overlapping scope, including those applying to specific field and to * field, including field-based rules and record-based rules. That is the cause of our different understanding of the EveryoneElse role. Could you clarify a little bit more?

Can you be specific on what you need me to clarify? Or propose a case that my proposal will make users / logically confused?

chpapa commented 7 years ago

After an offline discussion with @ben181231 here is some agreed update:

  1. Since anonymousUser would have performance overhead on implementation, and the case seems solvable since Public User Read is the default, we've agreed to remove this role.

  2. Acknowledged that the default rule should be as follows, to avoid Field-based ACL conflicts with Record-based ACL

setRecordFieldsACL(*, EveryoneElse, *, True, True, True, True)
setRecordFieldsACL(User, Owner, [password, auth], True, False, False, False)
setRecordFieldsACL(User, EveryoneElse, [password, auth],  False, False, False, False)
setRecordFieldsACL(User, EveryoneElse, [username, email],  True, False, True, False)
  1. The resolution of ACL should be: Record ACL => Field ACL, so each operation need to pass both ACL to be accepted.
ben181231 commented 7 years ago

how would you propose to solve the use cases mentioned above (build a social network which only signed up users can see photos)

Just verified. Log in is NOT required to query records if those records are set to be publicly read. So, there is no need to sign up anonymously in order to see photos.

ben181231 commented 7 years ago

The following demonstrates how the user cases above are modeled:

Default

Class UserRole Field Level
* AnyUser * Updatable
User Owner auth ReadOnly
User Owner password ReadOnly
User AnyUser auth NoAccess
User AnyUser password NoAccess

Make gender field private (not readable by others except owner)

Class UserRole Field Level
* AnyUser * Updatable
User Owner auth ReadOnly
User Owner password ReadOnly
User AnyUser auth NoAccess
User AnyUser password NoAccess
User Owner gender Updatable
User AnyUser gender NoAccess

Make gender fields only readable by owner and friends

Class UserRole Field Level
* AnyUser * Updatable
User Owner auth ReadOnly
User Owner password ReadOnly
User AnyUser auth NoAccess
User AnyUser password NoAccess
User Owner gender Updatable
User Ref:Friends gender ReadOnly
User AnyUser gender NoAccess

Make gender fields readable by everyone except Ben

Class UserRole Field Level
* AnyUser * Updatable
User Owner auth ReadOnly
User Owner password ReadOnly
User AnyUser auth NoAccess
User AnyUser password NoAccess
User Owner gender Updatable
User ID:Ben gender NoAccess
User AnyUser gender ReadOnly

Make a URL slug field

Class UserRole Field Level
* AnyUser * Updatable
User Owner auth ReadOnly
User Owner password ReadOnly
User AnyUser auth NoAccess
User AnyUser password NoAccess
User Owner slug Updatable
User AnyUser slug Discoverable
ben181231 commented 7 years ago

The following pseudo code demonstrate how record-based ACL and field-based ACL are resolved:

function recordOperationVeridation (operation) {
  const currentUser = global.context.getUser();
  const originalRecord = operation.originalRecord;
  const operationType = operation.operationType;

  if (originalRecord.access.hasAccess(operationType, currentUser) == false) {
    throw new NoAccessError(currentUser, operationType);
  }

  const recordFieldAccess = originalRecord.class.fieldAccess;
  const operatingFields = operation.operatingFields;

  // not considered access depends on data of the record
  operatingFields.forEach(function (eachOperatingField) {
    if (recordFieldAccess.hasAccess(eachOperatingField, operationType, currentUser) == false {
      throw new NoAccessError(currentUser, operationType, eachOperatingField);
    }
  });

  return true;
}
cheungpat commented 7 years ago

Questions:

chpapa commented 7 years ago

Why do we need password and auth in user record?

Actually maybe it's a good idea to just hide them from users' access

cheungpat commented 7 years ago

Actually maybe it's a good idea to just hide them from users' access

Not sure what you think, but I think it is better to leave the password and auth field in _user table (or we can change the table name to call _auth like previously @ben181231 suggested).

ben181231 commented 7 years ago

As #61 suggested, I am going to send 2 PRs about the draft of Field-based ACL and User Profile Discovery respectively for further discussion.

rickmak commented 7 years ago

Not sure what you think, but I think it is better to leave the password and auth field in _user table

I think we will still keep the _auth and profile separation. And the field-base ACL will only apply to profile (also other Record). And the ACL is not going to affect all _ prefix table?

Record instance will not have per field ACL. Do @chpapa agree? (Just checking we are on same page)

chpapa commented 7 years ago

I think we will still keep the auth and profile separation. And the field-base ACL will only apply to profile (also other Record). And the ACL is not going to affect all prefix table?

Sure. Except I think it is called user instead of profile.

Record instance will not have per field ACL. Do @chpapa agree? (Just checking we are on same page)

I thought it was said it is good to have per field ACL also for other Records type?

chpapa commented 7 years ago

Remain portal UI and replaced by feature issue #84