Azure / azure-webpubsub

Azure Web PubSub Service helps you to manage WebSocket connections and do publish and subscribe in an easy way
https://azure.github.io/azure-webpubsub/
MIT License

Please create a method to list all users in a group or that are connected to the service. #325

Open amiddles opened 2 years ago

amiddles commented 2 years ago

The API would benefit greatly from a call to get a list of users and connections connected to a group or hub. When a client first joins a group, they may want to know who is connected. Also I may want to poll a particular group to see who is online at that time. This is more robust since it's tracked inside PubSub which already is maintaining such lists internally.

There is already a UserExists method so this would be a good complement. I propose you add a method such

vicancy commented 2 years ago

In the current Web PubSub service, we actually don't preserve such a list. AddUserToGroup is actually a quick way to add the current connections for this user to the group, which means that if a connection for userA connects after the AddUserToGroup(GroupA) call, this new connection is not joined to GroupA automatically. So you can end up in a situation where some connections for userA belong to groupA while other connections for userA do not.
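A minimal sketch of the snapshot behavior described above, using a toy in-memory model. `ToyHub` and its methods are hypothetical names for illustration only; this is not the service's implementation or the SDK's API.

```python
from collections import defaultdict


class ToyHub:
    """Toy in-memory model of the semantics described above; not the real service."""

    def __init__(self):
        self.connections = {}            # connection id -> user id
        self.groups = defaultdict(set)   # group name -> set of connection ids

    def connect(self, connection_id, user_id):
        self.connections[connection_id] = user_id

    def add_user_to_group(self, user_id, group):
        # Snapshot semantics: only connections that exist right now are joined.
        for conn_id, uid in self.connections.items():
            if uid == user_id:
                self.groups[group].add(conn_id)


hub = ToyHub()
hub.connect("conn-1", "userA")
hub.add_user_to_group("userA", "GroupA")
hub.connect("conn-2", "userA")  # connects after the call, so it is not joined

print(sorted(hub.groups["GroupA"]))  # ['conn-1']
```

In this model, the second connection for userA stays outside GroupA until the application adds it explicitly.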

For the scenarios you have:

> When a client first joins a group, they may want to know who is connected

> poll a particular group to see who is online at that time

So if it is some group chat scenario, I feel that such a user-group map contains application logic, and it sounds fair to have a database store that map for your application instead.

amiddles commented 2 years ago

Could you please elaborate (and thanks for the reply).

Because you must keep a record of what connection goes with which user, can you not get a list of connections for the group and then join that list to the list of connections for the app to get the userIDs?

What are the technical reasons that the userID is not kept with the connection on the group? Is it because the userID is not a required parameter?

Given the above assumptions, I'd recommend that userIDs be kept with your connections on the groups and that you still add the methods mentioned above.

At a minimum, please provide a list of connections for a group and a method to look up that connection to get any associated userID that was given for it. If you store the connection as [connectionID]|[userID] then you can still do any kind of checks for the connection IDs without degrading performance, and if you are hashing them then just add them to a dictionary and track how many connections are opened or closed. Otherwise you can create a new dictionary to track them quite easily. You kind of do this already I'm sure.

I'm aware that a database can track the information, however that assumes that there is never an outage or disconnect between pubsub and the api endpoints. In that scenario the database will be stale compared to pubsub and out of sync. It is far better for pubsub to do that logic on its end as it already has that basic data. You don't really know how the service can be used, but tracking variables that are associated with a user or connection is important.

Also I may need to know what connections or users are still in a group, or what groups are assigned to a user. In my opinion the service ought to be able to report that.

vicancy commented 2 years ago

Consider a chat room with users joining a group: I think the source of truth should be your database that maintains the user-group mapping instead of relying on the Web PubSub service to store the user-group mapping. Web PubSub helps you to manage the alive connections but does not store any data persistently. The Group in Web PubSub is a logical concept; it is some kind of "group session" that starts when some connection joins it and ends when there is no connection in it. Consider the chat room scenario: when there is a group chat, and everyone joins the group but is offline, you don't want the group to disappear but still want the user-group mapping there, right? From this perspective, your "group" is not the same concept as the "group" in Web PubSub.
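The distinction above between durable membership and live connections can be sketched as an application-side registry. This is only an illustration under assumptions: `PresenceRegistry`, `on_connected` and `on_disconnected` are hypothetical names, and in a real application the two handlers would be driven by whatever connect and disconnect notifications the application receives.

```python
from collections import defaultdict


class PresenceRegistry:
    """Hypothetical application-side store; the names here are illustrative."""

    def __init__(self):
        self.members = defaultdict(set)   # group -> user ids (durable membership)
        self.online = defaultdict(set)    # user id -> live connection ids

    def add_member(self, group, user_id):
        self.members[group].add(user_id)

    def on_connected(self, user_id, connection_id):
        self.online[user_id].add(connection_id)

    def on_disconnected(self, user_id, connection_id):
        self.online[user_id].discard(connection_id)
        if not self.online[user_id]:
            del self.online[user_id]      # no live connections left for this user

    def online_members(self, group):
        return sorted(u for u in self.members[group] if u in self.online)


registry = PresenceRegistry()
registry.add_member("room-1", "alice")
registry.add_member("room-1", "bob")
registry.on_connected("alice", "c1")
print(registry.online_members("room-1"))    # ['alice']

registry.on_disconnected("alice", "c1")
print(registry.online_members("room-1"))    # []
print(sorted(registry.members["room-1"]))   # ['alice', 'bob']
```

Membership survives everyone going offline, which is the property the chat room example relies on.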

I'd also like to understand your concern about "stale data" in your database. Could you provide me an example of what would be out of sync?

amiddles commented 2 years ago

I think the issue is that you are assuming everyone is making a chat interface, but I am using this to move data notifications across a platform to connected clients that may want to know when a job is done, data has changed, or to indicate that they can take a phone call. What if I want to revoke security for a user but don't have their connection info? At the moment I literally have to check if they are in the group to do that. What happens if for my phone system I don't get the event to say they disconnected and now I have a customer alive in my queue waiting for an answer that won't come? Basically you have to broaden your understanding of how this can be used. It's a pub sub system, not a group chat system.

Also you just said that you keep a list of active connections, that's all an app needs you to return to make a decision. I want to be able to ask what all the active connections are for a group or the service as a whole. You already have the data ready to go, it should be easy as pie to add a method to return that information.

And as a customer of Microsoft, really you should be heeding your customer needs even if you disagree with them. We are the ones paying for the service so clearly we have a need for these kinds of features.

vicancy commented 2 years ago

Sorry for the late response, (just back from holiday).

Could you share more about your user scenario? We can also discuss more through email lianwei(at)microsoft.com or through meetings (we can arrange that through email too) on how to fit Web PubSub into your scenario.

abymsft commented 2 years ago

@vicancy I've posted a use case that needs a "getAllUsersForGroup" function here

vicancy commented 2 years ago

Quote here:

Scenario: User A, B & C joined a group (group is modifying a common resource/document). User A makes a change to the document which needs to be notified to the group (except User A). Question: How can the developer avoid sending the notification to "User A"? Comment: The developers do not intend to handle this scenario through UI code. If there were a function to retrieve list of users subscribed to a group, this situation could have been addressed.

Hi @abymsft, if you are using the subprotocol, there is a property called noEcho.

For this scenario, I think it is actually not about skipping "UserA" but about skipping the connection that is modifying the document. The same user with another browser open would expect to receive the updates too.

If you are not using the subprotocol but the REST API (server SDK), would it help if we provided an "exclude" option when sending messages to a group?
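To make the difference between excluding a user and excluding a connection concrete, here is a small self-contained sketch. The `recipients` helper and the sample data are hypothetical; it only models the idea of an "exclude" option and is not a service or SDK call.

```python
def recipients(group_connections, exclude=()):
    """Return the connection ids that should receive a group message."""
    excluded = set(exclude)
    return [c for c in group_connections if c not in excluded]


# Hypothetical data: user A has two connections (two browsers) in the group.
group = {"a-tab1": "userA", "a-tab2": "userA", "b-tab1": "userB", "c-tab1": "userC"}

# Exclude only the connection that made the edit, not every connection of user A.
print(recipients(group, exclude=["a-tab1"]))  # ['a-tab2', 'b-tab1', 'c-tab1']
```

User A's second browser still receives the update, which matches the expectation described above.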

Johno-ACSLive commented 6 months ago

Hi,

I've come across this thread needing this feature.

Suppose my service handles adding users to groups and assigning the connection "SendToGroup" permissions because I want to control which users are assigned to which group and only the users in that group can send messages.

Firstly I have a problem with this approach. We should be able to grant the "SendToGroup" permission to the userid, so that it survives the case where the connection fails (as in the client connection is not recovered but reconnected), since that produces a new connectionid. It would be nice to configure it on the userid because we can add both connectionid and userid to groups, so it makes sense to be able to do both for permissions (at least the SendToGroup permission).

In any event if the person "created" a group by joining it and wants to ensure that no one is left in the group (essentially "deleting" the group) then we need to know who's in the group. Unless the team can implement a call to Remove all user id's and connection id's from the group and associated group permissions.

Keeping a database was mentioned. However, this means keeping two copies of the data because Azure Web PubSub already knows what groups there are, what user id's have been allocated and what permissions have been allocated. Therefore I shouldn't need to double handle this information. Someone mentioned synchronisation could be out, which is correct: what happens if a transaction fails and my service receiving events gets bounced? I shouldn't need further services to cache the data to send to a database or anything like that when the PubSub service already has those mappings - hence it was mentioned as the source of truth.

There are a lot of use cases for needing the below and as the information is already available in the service we should be able to perform the following:

vicancy commented 6 months ago

Hi @Johno-ACSLive , thanks for the details. Some follow up questions on the scenarios:

  1. In your scenario, your server adds/removes users to/from the group and adds/removes permissions for the users, while your users send messages to the group, right?
  2. Does your server need to dynamically add/remove permissions for a user? We do have an API to "Remove connections/users from groups" https://learn.microsoft.com/rest/api/webpubsub/dataplane/web-pub-sub/remove-connections-from-groups?view=rest-webpubsub-dataplane-2023-07-01&tabs=HTTP , but we do not yet have an API to "Remove connections/users group permissions". We could add this one if needed.

Johno-ACSLive commented 6 months ago

Hi @vicancy,

The flow in my scenario is that a user will create a "session" (group in Web PubSub by essentially being the first to join it). They then own control over who can join the "session" (group). My custom backend service handles this (client sends user events to enact the appropriate action e.g. createsession, deletesession, joinsession, unjoinsession, bootuser etc.). All users who are in the "session" (group) can communicate with each other only. No one can join the session and publish messages unless the owner allows them to join.

The reason for this is that the user creating the "session" (group) will limit the number of users who will join and thus limit who is allowed to join the group as described above. The owner of the "session" (group) can also boot users from the "session" (group). The owner does not know what the connectionid's are. It only knows certain other information. The service I wrote to handle events doesn't store or track these details either since Web PubSub already does. It's just that the information is not available to query.

I've been mocking up to test PubSub's suitability but running into this limitation (and found this has already been raised but not addressed) in that Web PubSub knows which user has joined which group but we can't query the service to find out. I'd rather not duplicate that data in a database. At the moment the only thing I need to track is who is the owner (I'm not sure if Web PubSub tracks which userid first joined - if it does this would be useful to retrieve as well).

Ably provides this capability - their occupancy under the heading - "Retrieve presence members". They also have a heading further down called "Synced presence set" which is partly what we have to do now (although we need to use events 100% of the time to build and maintain the list for Azure Web PubSub at this point in time). Ably advise against this because they are already tracking everything - same as Web PubSub.

What I described above from Ably is just one example for what has been requested here but hopefully demonstrates why as well.

To answer your questions, for point 1 - yes. For point 2, this is already part of the Server SDK (both - e.g. I can call RemoveConnectionFromGroup or RemoveUserFromGroup as well as related async methods). There is also a call I can make to RevokePermission.

The problem with both GrantPermission and RevokePermission is that I can only specify a connection when I should also be able to specify a user. This is because I can specify either a user or a connection when adding to or removing from a group (e.g. AddUserToGroup/RemoveUserFromGroup), so I should have the same options here.

However, as described I'm not tracking which user has been added to a group nor which connection has been granted what permission to which group. Web PubSub does track this, hence the request for the capabilities I and others have mentioned. That way we are not duplicating data (potentially having stale data) as others have mentioned and as per Ably's point of view with their equivalent service.

I hope this helps move this forward.

vicancy commented 6 months ago

Thanks for the detailed explanation @Johno-ACSLive I understand your needs now that you'd like a way to remove connections/users and also the permission from a group except the "owner" user.

IMHO, having a method to "list all users in a group" might not be a simple enough solution considering race conditions when there are new users coming in between "getting all the users in a group" and "removing users and permissions from the group". Some atomic method call to directly remove these users/permissions might fit more into your scenario.
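The race described above can be illustrated with a deterministic toy simulation. The two helper functions and the dictionary standing in for a group are hypothetical; the sketch only contrasts acting on a stale snapshot with evaluating the condition at removal time, and says nothing about how the service implements either.

```python
def remove_listed(group, snapshot, keep):
    # List-then-remove: acts on a snapshot that may already be out of date.
    for conn_id in snapshot:
        if group.get(conn_id) != keep:
            group.pop(conn_id, None)


def remove_where_not(group, keep):
    # Single-step removal: the condition is evaluated against the current state.
    for conn_id in [c for c, user in group.items() if user != keep]:
        del group[conn_id]


group = {"c1": "owner", "c2": "userB"}
snapshot = list(group)          # step 1: "list all users in a group"
group["c3"] = "userC"           # a new connection joins in between
remove_listed(group, snapshot, keep="owner")
print(group)                    # {'c1': 'owner', 'c3': 'userC'}

remove_where_not(group, keep="owner")
print(group)                    # {'c1': 'owner'}
```

The connection that joined between the two steps survives the snapshot-based removal but not the single-step one.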

We actually have a REST API ready for this need that lets you specify filters when removing connections from a group (https://learn.microsoft.com/en-us/rest/api/webpubsub/dataplane/web-pub-sub/remove-connections-from-groups?view=rest-webpubsub-dataplane-2023-07-01&tabs=HTTP). SDK support is ongoing; once it is available, the method call would be similar to ServiceClient.RemoveConnectionsFromGroup(group: "group1", filter: "userId ne 'owner'"). We apply OData syntax to the filter, as described here: https://learn.microsoft.com/en-us/azure/azure-web-pubsub/reference-odata-filter.

In your scenario another API is also needed, something like ServiceClient.RevokePermissions(permission: "permission", target: "group1", filter: "userId ne 'owner'")
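To show what a filter such as `userId ne 'owner'` would select, here is a tiny evaluator that understands only a single `userId eq` or `userId ne` comparison against a quoted literal. It is an illustrative subset written for this example, not the service's OData implementation, and the `matches` helper and sample data are hypothetical.

```python
import re

# Matches exactly one comparison of the form: userId eq 'x'  or  userId ne 'x'
_FILTER = re.compile(r"^\s*userId\s+(eq|ne)\s+'([^']*)'\s*$")


def matches(filter_text, user_id):
    """Evaluate a single `userId eq|ne '<literal>'` comparison (illustrative subset)."""
    m = _FILTER.match(filter_text)
    if m is None:
        raise ValueError(f"unsupported filter: {filter_text!r}")
    op, literal = m.groups()
    return (user_id == literal) if op == "eq" else (user_id != literal)


connections = {"c1": "owner", "c2": "userB", "c3": "userC"}
to_remove = [c for c, u in connections.items() if matches("userId ne 'owner'", u)]
print(to_remove)  # ['c2', 'c3']
```

Every connection except the owner's is selected, which is the effect the proposed calls aim for.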

How do you feel about the proposed APIs?

Johno-ACSLive commented 6 months ago

> Thanks for the detailed explanation @Johno-ACSLive I understand your needs now that you'd like a way to remove connections/users and also the permission from a group except the "owner" user.

@vicancy I'd like to query groups for their list of current user id's, associated connectionid and permissions. I can already remove them via the Server SDK. It would be nice to know who "created" the group i.e. first user/connection joined but I don't need to exempt removal of the "owner".

> IMHO, having a method to "list all users in a group" might not be a simple enough solution considering race conditions when there are new users coming in between "getting all the users in a group" and "removing users and permissions from the group". Some atomic method call to directly remove these users/permissions might fit more into your scenario.

Should be simple enough, the service already knows and can just provide what it knows at that point in time. If there is a change then I can use event notifications if it's critical for whatever reason but in most situations I don't see that it would be required. If for example the service returns a permission but it had been removed between me obtaining the response and actioning it (very unlikely to occur), the Server SDK has checks (GroupExists, CheckPermission, ConnectionExists and UserExists). If the thing being checked doesn't exist then Web PubSub will respond as such. I would typically call those checks first but even then I could check a connection exists and then the client disappears before I remove them from a group etc. so I need to handle such responses either way.
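As a rough illustration of how an existence check could be used to clean up after a missed event, the sketch below prunes an application-side record against an injected check. `reconcile` is a hypothetical helper, and `connection_exists` is a stub callable standing in for a service existence check; no real SDK call is made here.

```python
def reconcile(tracked, connection_exists):
    """Drop tracked connections that the service no longer reports as alive.

    `tracked` maps connection id -> user id (the application-side record).
    `connection_exists` is an injected callable standing in for a service
    existence check; here it is a stub, not a real SDK call.
    """
    stale = [c for c in tracked if not connection_exists(c)]
    for conn_id in stale:
        del tracked[conn_id]
    return stale


tracked = {"c1": "agent-1", "c2": "agent-2"}
alive = {"c1"}  # pretend the disconnect event for c2 was missed
print(reconcile(tracked, alive.__contains__))  # ['c2']
print(tracked)                                 # {'c1': 'agent-1'}
```

This only repairs records for connections the application already knows about; it cannot discover connections it never recorded, which is the gap the requested list API would cover.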

> We actually have a REST API ready for this need that lets you specify filters when removing connections from a group (https://learn.microsoft.com/en-us/rest/api/webpubsub/dataplane/web-pub-sub/remove-connections-from-groups?view=rest-webpubsub-dataplane-2023-07-01&tabs=HTTP). SDK support is ongoing; once it is available, the method call would be similar to ServiceClient.RemoveConnectionsFromGroup(group: "group1", filter: "userId ne 'owner'"). We apply OData syntax to the filter, as described here: https://learn.microsoft.com/en-us/azure/azure-web-pubsub/reference-odata-filter.

That gets us part way there, however, doesn't address all cases. What if an event is missed, how do I signal to users in the group that a user has gone if it gets missed for whatever reason? Again in the case that the Client SDK doesn't recover but reconnects I need to check which group they were allocated to. There is currently no API for me to send a query to.

What if I want to build a management dashboard and provide statistics like how many groups there are and how many users and connections there are per group? What if we need to manage connections through a management interface to provide some facility that requires requesting information?

To summarise, we need API's and SDK updates that will allow the following:

olavt commented 6 months ago

Is there really no way to list all users for a hub today?

I have a use case where I would like to use Azure WebPubSub for a home automation scenario.

The users would be Home Automation controllers running on physical hardware at various locations. The controllers would connect to the server and a web application should be able to send messages to the controllers. The web application should be able to list the connected controllers and the user of the web application should be able to select a controller for management. The web application would then send messages and get responses from the controller.

Today I use Azure IoT Hub for this scenario, but I would like to have alternative implementations for this scenario.

Johno-ACSLive commented 6 months ago

> Is there really no way to list all users for a hub today?

You won't find a call for it in the Server SDK and I haven't seen anything for it in the API documentation for the data plane.

olavt commented 6 months ago

Ok, then I can't use this service for my scenario.

Y-Sindo commented 3 days ago

Hi, we plan to add some REST APIs to address the requirements in this issue. Please check #799 for the proposal to list users and connections in a group. Feel free to leave your comments.

Johno-ACSLive commented 2 days ago

@Y-Sindo it's a start but more of what's been requested needs to be included. I've commented on 799.

Y-Sindo commented 2 days ago

@Johno-ACSLive In your scenario where you need to:

  1. Query groups for their list of current user id's, permissions.
  2. Recover the groups and permissions when the Client SDK doesn't recover but reconnects.

I believe additional persistent storage is necessary. For example, use a database to record the permissions and groups of users. As our service currently doesn't store customer data persistently, such data would be lost if a user doesn't have an online connection for a long time. If this kind of requirement is common, we've also considered a new feature called "Bring your own storage" that allows our customers to provide custom storage for our service to read and write.
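A minimal sketch of the database approach suggested above, using Python's built-in `sqlite3` module. The table layout, the `record` and `restore_plan` helpers, and the permission string are all assumptions made for illustration; re-applying the returned pairs to a new connection would be done with whatever server-side calls the application already uses.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # a file path would make this durable
db.execute(
    "CREATE TABLE membership ("
    "user_id TEXT, grp TEXT, permission TEXT, PRIMARY KEY (user_id, grp))"
)


def record(user_id, group, permission):
    # Upsert the user's group membership and the permission granted for it.
    db.execute(
        "INSERT OR REPLACE INTO membership VALUES (?, ?, ?)",
        (user_id, group, permission),
    )
    db.commit()


def restore_plan(user_id):
    """Return (group, permission) pairs to re-apply to a user's new connection."""
    rows = db.execute(
        "SELECT grp, permission FROM membership WHERE user_id = ? ORDER BY grp",
        (user_id,),
    )
    return rows.fetchall()


record("userA", "session-1", "sendToGroup")
record("userA", "session-2", "sendToGroup")
print(restore_plan("userA"))
# [('session-1', 'sendToGroup'), ('session-2', 'sendToGroup')]
```

Keying the table on the user rather than the connection means the record stays valid when a reconnect produces a new connection id.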

Johno-ACSLive commented 1 day ago

@Y-Sindo since my backend service is handling management of connections to the service and permissions to groups, the Client SDK doesn't need to recover anything so point 2 can be ignored.

However, for point 1 I mentioned earlier that the service already knows about and is tracking these items so why shouldn't I be able to query it? I shouldn't need persistent storage if the service already knows what's been configured.

There are also some other queries I should be able to make since the service is already tracking these items.

Johno-ACSLive commented 1 day ago

> Ably provides this capability - their occupancy under the heading - "Retrieve presence members". They also have a heading further down called "Synced presence set" which is partly what we have to do now (although we need to use events 100% of the time to build and maintain the list for Azure Web PubSub at this point in time). Ably advise against this because they are already tracking everything - same as Web PubSub.

@Y-Sindo this is the info relating to another product that offers the same service.