matrix-org / matrix-js-sdk

Matrix Client-Server SDK for JavaScript
Apache License 2.0

Group call "PTT" is ambiguous in meaning #3516

Open HybridEidolon opened 1 year ago

HybridEidolon commented 1 year ago

https://github.com/matrix-org/matrix-js-sdk/blob/7d45947fb3dc4b0e291e03adac5348ad8eeaf32a/src/webrtc/groupCall.ts#L287

I'm currently looking at how other JS-based clients would implement group calling in matrix-js-sdk and I'm seeing a potential source of confusion for implementors around "PTT".

The meaning of "PTT" for "Walkie-Talkie Mode", as described in this blog post for Element Call, is specific to a client mode of operation, but PTT outside of a walkie-talkie context has a different meaning. In many VoIP clients (Discord, Zoom, Mumble, etc.), Push-to-Talk has nothing to do with the type of call; it is a user preference under which the client does not transmit audio unless a key is held, irrespective of whether others in the call use PTT. That feature wouldn't need clients to specify or observe io.element.ptt in the call state.
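To make the distinction concrete, here is a minimal sketch of push-to-talk as a purely local preference. The names `MutableTrack` and `LocalPushToTalk` are hypothetical and not part of matrix-js-sdk; the only assumption is that audio tracks expose a writable `enabled` flag, as a browser `MediaStreamTrack` does. Nothing in it reads or writes call state.

```typescript
// Hypothetical types for illustration only; not matrix-js-sdk API.
// A browser MediaStreamTrack satisfies this shape structurally.
interface MutableTrack {
    enabled: boolean;
}

// Local push-to-talk: audio is transmitted only while the key is held.
// This is a per-user preference and never touches shared call state.
class LocalPushToTalk {
    private readonly tracks: MutableTrack[];
    private held = false;

    constructor(tracks: MutableTrack[]) {
        this.tracks = tracks;
        // Start silent: the user has not pressed the key yet.
        this.apply();
    }

    // Call from the keydown handler for the user's chosen PTT key.
    press(): void {
        this.held = true;
        this.apply();
    }

    // Call from the matching keyup handler (and when the window loses focus).
    release(): void {
        this.held = false;
        this.apply();
    }

    get transmitting(): boolean {
        return this.held;
    }

    // Enable or disable every local audio track to match the key state.
    private apply(): void {
        for (const track of this.tracks) {
            track.enabled = this.held;
        }
    }
}
```

Because the whole feature lives on the sending side, other participants cannot tell whether a given user has it enabled, which is why it needs no flag on the call.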

When using createGroupCall or constructing a GroupCall object, however, the parameter to enable this "walkie talkie mode" in clients that would support it is called isPtt.

From an API perspective, I think this should:

  1. Somehow be a separate extension when making a group call object in this API, and
  2. Use a clearer name (e.g. "walkie-talkie mode" is perfectly specific) that doesn't overlap with other uses of the term PTT.

That is, assuming this should be part of matrix-js-sdk at all, since it's a namespaced extension. Is there a way for a client to attach and observe the additional io.element.ptt key on the state without it being a mandatory parameter in group call construction?
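One possible shape for this, sketched under assumptions: the SDK would accept an opaque map of extra state keys and leave their interpretation to the client. Everything below is hypothetical (`CallStateContent`, `withExtensions`, `isWalkieTalkie`), and the boolean value for io.element.ptt is assumed for illustration; none of it is existing matrix-js-sdk API.

```typescript
// Hypothetical sketch; none of these names exist in matrix-js-sdk.
type CallStateContent = Readonly<Record<string, unknown>>;

// The namespaced key discussed in this issue. Its value is assumed
// to be a boolean for the purposes of this example.
const WALKIE_TALKIE_KEY = "io.element.ptt";

// Merge optional, client-defined extension keys into the base state content.
// Callers that do not care about extensions simply omit the second argument.
function withExtensions(
    base: CallStateContent,
    extensions: Record<string, unknown> = {},
): CallStateContent {
    return { ...base, ...extensions };
}

// Client-side check: the SDK itself never needs to know what the key means.
function isWalkieTalkie(content: CallStateContent): boolean {
    return content[WALKIE_TALKIE_KEY] === true;
}
```

With a shape like this, a client that supports the mode would pass `{ [WALKIE_TALKIE_KEY]: true }` when creating the call, and every other client could ignore the key entirely.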

Furthermore, the requirement to disallow unmuting while another user is unmuted is something the user interface has to implement. This isn't clearly indicated anywhere in this API yet, and it could be a source of call disruptions if a participant uses a client that does not support this correctly (e.g. forcing other participants to stay muted indefinitely because the joiner doesn't mute by default).
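As a rough illustration of the rule each client would have to enforce, here is a minimal sketch. The `Participant` shape and both functions are hypothetical and not taken from matrix-js-sdk.

```typescript
// Hypothetical participant shape for illustration; not a matrix-js-sdk type.
interface Participant {
    userId: string;
    audioMuted: boolean;
}

// Walkie-talkie rule: the local user may unmute only while every
// other participant is muted.
function canUnmute(localUserId: string, participants: Participant[]): boolean {
    return participants.every(
        (p) => p.userId === localUserId || p.audioMuted,
    );
}

// User IDs of remote participants who are currently unmuted and are
// therefore blocking the local user from speaking.
function blockingSpeakers(localUserId: string, participants: Participant[]): string[] {
    return participants
        .filter((p) => p.userId !== localUserId && !p.audioMuted)
        .map((p) => p.userId);
}
```

The failure mode described above falls straight out of this check: a participant who joins unmuted from a client that ignores the rule makes `canUnmute` return false for everyone else until they mute themselves.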

There is also a use case for "mandatory" PTT in a call without one-speaker-at-a-time semantics, which is a commonly used feature in large group calls on Discord and Mumble. Similarly, Zoom has an option to require all participants joining a conference to be muted by default. Both are probably outside the scope of this issue, but maybe some food for thought.

wwwmaster1 commented 6 months ago

Great summary and valid concern. I would add that there may be confusion over what such a feature actually does on the server. Is it merely an audio file that is relayed? Or is it stored in a chat?