element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0
11k stars 1.96k forks source link

Push-to-talk in video/voice conferencing #5993

Open BloodyIron opened 6 years ago

BloodyIron commented 6 years ago

Whether it's Freeswitch or Jitsi (AFAIK Freeswitch is on the way out), having Push-To-Talk for channel chat would be seriously appreciated! Right now if you have a channel with a whole lot of people wanting to talk, it can turn into a really big mess.

Also, this PTT feature would need to work when the "app" doesn't have focus. I'm thinking this may not work unless the "desktop" version is used, but I'm not 100% up to speed with browser features like this.

It might also be a good idea to get a keyboard bind to toggle self-muting of your own microphone. Let's say, if your fat cat knocks a glass of water all over your desk, oh noes!

t3chguy commented 6 years ago

Also, this PTT feature would need to work when the "app" doesn't have focus. I'm thinking this may not work unless the "desktop" version is used, but I'm not 100% up to speed with browser features like this.

yes this would not be possible in-browser, Discord has the same issue

BloodyIron commented 6 years ago

@t3chguy yeah, I wasn't sure if that particular detail was "solved" by browser devs since I last looked at it. ;)

t3chguy commented 6 years ago

as far as I'm aware its done by intent, so web applications can't behave like keyloggers

BloodyIron commented 6 years ago

@t3chguy that... is actually completely reasonable! I hadn't thought of that. Please don't keylog me Facebook! ;D

turt2live commented 6 years ago

ftr this is listed in the voip metabug: https://github.com/vector-im/riot-web/issues/3025#issuecomment-287929855

BloodyIron commented 6 years ago

@turt2live when I found that thread, I wasn't sure if the PTT facet would get lost in the size of that thread. :( So, I opened this one.

Along the lines of the reason why we have individual issues, instead of just one big meta issue for every single thing people want. ;P

turt2live commented 6 years ago

I'm not declaring a duplicate, just providing a reference so people can find it more easily.

BloodyIron commented 6 years ago

@turt2live you monster! :O

josephtocci commented 6 years ago

Nextel PTT (Always online, push notifications with audio, Example: Zello) Remember this? I was too young to have a phone, but my parents each had one and the PTT was far above everything else at the time. There were no smartphones at the time. It was faster than calling, and usually the message got through. These days texting won out, partially because if you put your phone down and came back to it you could just read it. PTT you might miss. The other reason texting won out is because you don't necessarily want the person you are talking too IRL to hear what the person on the phone said. There are pros and cons to this method. There is a decent following on Zello which does this. In fact I know someone who uses Zello with all his camping buddies.

Mumble PTT (Must be running, no push notifications for audio, login/join/accept call to use and hear PTT) This one is more what people will do now. Especially gamers. Even my friend on Zello would probably use this instead. The way he uses Zello is to schedule an hour with everybody, and everybody goes online at the same time and talks. There is no reason to use the always on Zello if you are just going to chatroom with voice. Basically you just mute the microphone, and have a keyboard button to hold down to unmute. (Headphone button on phone) If someone is trying to be quiet IRL and can't get around it, he can plug his headphones in to hear, and type in the room instead of talking. Optionally a text-to-speech thing would be cool, but that is a separate subject.

My thoughts I think the Mumble type PTT is what people actually seem to need. You set up a room for the game, and everyone logs on and plays the game. (Not necessarily games but games is the common example) Text is king unless you are doing something with your hands, if you are doing something with your hands, set up a call. It's not complicated.

Forward So I recommend a keyboard button for desktop and headphone button for phone would be excellent shortcuts to temporarily unmute the microphone while it is held down for both native calls and jitsi conferencing. I would recommend Riot to certain people if it had that.

t3chguy commented 6 years ago

headphone button for phone

/me points at vector-im/riot-ios and vector-im/riot-android

josephtocci commented 6 years ago

Android issue created: https://github.com/vector-im/riot-android/issues/1905

iPhone issue created: https://github.com/vector-im/riot-ios/issues/1736

josephtocci commented 6 years ago

TL/DR, turn the camera on when you press the unmute button, unless you want an audio only conference call. Would reduce load

You could even make the same button to unmute the microphone to turn on the camera. I don't necessarily want my camera on all the time, but when I'm talking I want it on. So have the camera on but don't send anything to matrix unless I hold down the PTT button to unmute the microphone. You could make this the default for conference calls and not for 1-on-1. Does that solve the infrastructure problem? Maybe sometime in the future add the Nextel PTT, it is awesome, but if you say it's too much for now then ok.

It reduces load, and you can have an audio PTT only/no camera conference option for gamers, and a full conference call option for people that have 20 eyes to look at 20 different camera feeds for their friends. lol. Probably the best option considering infrastructure, and nothing has to change server side because load should go down even if you add a healthy amount of users.

BloodyIron commented 6 years ago

So any word when we can see push to talk for riot?

BloodyIron commented 6 years ago

Where is this on the radar? I know it's listed p3, but I'm not sure how that falls in terms of timeline of being looked at.

t3chguy commented 6 years ago

This would have to be implemented by Jitsi. (as Jitsi is what we use for Conferencing)

BloodyIron commented 6 years ago

Any chance the Riot team can co-ordinate that with Jitsi?

BloodyIron commented 5 years ago

Well, hoping I can appeal to them : https://github.com/jitsi/jitsi-meet/issues/3464

If anyone wants to help, please post her and in the linked jitsi thread. This really needs to get implemented. Not being able to change the shortcut is pretty unacceptable.

anoadragon453 commented 5 years ago

Just a heads up that I'm making major progress with this, and hope to have basic PTT functionality implemented very soon :)

anoadragon453 commented 5 years ago

Got a very good chunk done tonight. Getting the functionality working required some extra PRs in other projects, but now I have everything I need to see it through.

Just working out the last fiddly bits, but I will do so tomorrow :)

anoadragon453 commented 5 years ago

So I've managed to get proper functionality working! You can see a short and silly demo here: https://matrix.to/#/!DdJkzRliezrwpNebLk:matrix.org/$1541185902690rkqXW:amorgan.xyz

There's currently two problems however:

  1. You're currently unable to edit the shortcut
  2. The shortcut is registered when you add a Jitsi widget, not when you join in to the call, meaning it might not register in all cases

I plan to fix both of those (hopefully today) by integrating into Riot's settings, which I should be able to use to just register the shortcut on start-up and/or when the functionality is enabled.

Accomplishing all this required some PRs in an extra repo, notably https://github.com/WilixLead/iohook, and as such we're currently waiting for those to make it to master, which one already has.

So yes, functionality is nearing close, exciting!

BloodyIron commented 5 years ago

@anoadragon453 wait... you can link to Riot URLs, didn't even know that...

As for your work, woot! Awesome!

Does this work when the app is minimized/not in focus?

Also, for the shortcut change, are you going to push that upstream to jitsi as per your comment in : https://github.com/jitsi/jitsi-meet/issues/3464 ?

Woot! Thanks for your hard work @anoadragon453 ! :DD

anoadragon453 commented 5 years ago

Thank you!

Does this work when the app is minimized/not in focus?

Yep!

Also, for the shortcut change, are you going to push that upstream to jitsi as per your comment in : jitsi/jitsi-meet#3464 ?

So turns out the jitsi people were right and we can just use their iframe API to trigger the change. They do, unhelpfully, only have a function for toggling audio, so for muting we first have to query whether the audio is already muted, and if not then toggle.

So essentially what I'm doing here is useful for electron apps that want to embed Jitsi and have PTT functionality to control it, but I'm not actually changing anything in the jitsi project itself.

BloodyIron commented 5 years ago

@anoadragon453 okay but I was also asking about the changing of the PTT key binding, my understanding is that needs to be added upstream to jitsi, is that not the case?

anoadragon453 commented 5 years ago

Ahh, well since we're controlling it from Riot the keybinding actually happens in Riot. We aren't sending a "space" to Jitsi, we're sending a command to toggle mic audio from their API, so making the existing keybinding changeable in jitsi was not something that was necessary to accomplish it in Riot.

However, while jitsi people have said their shortcut code isn't very extensible, I assume someone could probably hack in a new item into the menu to change the shortcut.

Not promising anything on that end, but I am learning a lot about how a process like that may go with this project, so perhaps one day in the future?

In the meantime, use Jitsi through Riot and everything will be wonderful :D

BloodyIron commented 5 years ago

@anoadragon453 I primarily care about riot, I just didn't realise we could change the keybind from the riot end. I like where this is going! Yay! :D

anoadragon453 commented 5 years ago

If you want an update, I'm currently getting close to implementing enabling the functionality and changing the shortcut in settings.

How to record what shortcut the user is currently pressing is still a bit of an unknown for me. I might try to make it work with iohook, or perhaps instead use a separate library as the user will have Riot in focus anyways. The only issue with the latter is iohook seems to use a different encoding for keys than most other libraries, so Shift - A in one lib may be Ctrl - 7 in iohook.

Anyways, I'll try stuff out and report back on how it goes!

anoadragon453 commented 5 years ago

Alright! So after the work of tonight we have a nearly complete spot in Settings for push to talk now. It has a enable/disable toggle, a display of what the current shortcut is, and a button to set that shortcut. One thing that still needs to be fixed on the enabling/disabling front is enabling the shortcut on startup if you've enabled the option before already. Not entirely sure where to do this yet, but I'll likely find a spot soon.

In addition, I still need to figure out a way to record what shortcut keys are currently being pressed and then assign that to activate push-to-talk, but with the way I've set things up it should be as simple as dropping in a library/function into the right spot.

Some additional things that may be useful (but possibly not in scope of this PR) are:

But other than that, things work as expected!

ara4n commented 5 years ago

(one thought: please make sure that the iohook stuff is only enabled if the user has explicitly turned on PTT. and please apply all possible paranoia to minimise the risk of a JS exploit in Riot somehow giving an attacker a route to turning the app into a systemwise keylogger through the magic of iohook...)

anoadragon453 commented 5 years ago

@ara4n Very good point! Selective enablement can certainly be arranged. I'll need to look in to how far a JS exploit in electron could go. If it would allow for IPC calls then that would not be fun.

ara4n commented 5 years ago

i haven't looked into how iohook works at all, but i'm assuming that you're configuring it from JS to say "oh hai, please tell me whenever the user hits alt-space no matter which app is in the foreground". at which point a JS XSS could presumably say "oh hai, i'm interested in all the other keycodes too". So i'm wondering the native code of iohook has some way of being hardcoded to prevent that from ever happening (e.g. only ever letting you register a single keycode to intercept).

(This said, a JS XSS in electron is very bad news anyway, given there may be exploits which can escalate into running native code too, like the https://electronjs.org/blog/chromium-rce-vulnerability trainwreck. But we should still try to safeguard a trivial XSS from being able to break outside the web env via native extensions like iohook...)

anoadragon453 commented 5 years ago

Seeing as I've thrown a few PRs at iohook already, I could easily add one that limits how many shortcuts you can set at once, perhaps in the package.json? Is that modifiable from electron?

BloodyIron commented 5 years ago

Oh, also, any chance we can get Mouse button support for PTT? :)

anoadragon453 commented 5 years ago

We're definitely aiming for it!

anoadragon453 commented 5 years ago

Alright, so push to talk in Settings is fully functional now. However, while it saves and registers the shortcut, iohook won't actually pick up on it because it listens for keycode instead of rawcode in keyboard events. I've created an issue to fix this and if not given much fanfare will create a PR for it soon as well. A release for iohook still hasn't gone out since my other PRs, so hopefully this one will make it in before a new release does go out.

But yes, with that we will have keyboard shortcut recording (already done) and then the ability to listen for the recorded shortcut! Unfortunately not that many key combinations are available at the moment, as well as mouse support currently not out, but I'll try to get a PR for all of this up and running and then we can have people with different OS' and setups try things and give feedback.

Finally, this currently only works on Dimension, but I'm adding support for Scalar, Riot's default integration manager, as soon as everything else is working. I've already got the dev env set up :)

skylord123 commented 5 years ago

@anoadragon453 I love you. Keep it up.

anoadragon453 commented 5 years ago

@skylord123 Thanks!

Alright, I think I've finally figured out a nice solution to the keymapping problem. This whole time I've been trying to record items using a library like Mousetrap so that we can have a nice visual of the current keys that are being pressed. The problem with this is that Mousetrap doesn't have a nice display for many items. Nevertheless, as it had a common mapping with iohook (the rawcode property), I went ahead and implemented it, and life was pretty good.

However, the fact that we couldn't properly display all combinations of keys being pressed still bugged me. If you pressed Enter, it would be ' ' as being the current shortcut, and while one could do a lot of regex to make this nicer, it's not an elegant solution.

Cue KeyboardEvent, which is a thing that exists, and actually generalizes to all OS' and international keyboards (as best as it can)! While this is great for display, it unfortunately doesn't have any codes that are in common with what iohook is expecting.

Then I realized, well we could just listen with both at the same time? And since this option is device-specific, if the rawcode of, say, Enter ends up being 65383 and 87 on Windows, that's alright as you'd be setting it per-device.

So, the solution is now, when setting a shortcut, use iohook and this fancy in-browser event listener to both display and get the OS/keyboard-specific keycodes for any given shortcut, even supporting fancy ones like Ctrl+ K + Enter + Backspace (for some reason), which wasn't possible with the Mousetrap setup.

As for security, this would require a function allowing iohook to listen to all keyboard shortcuts. Fortunately, I think we can limit this to working only when Riot is in-focus, cutting it off if the user decides to go out of focus. This is controlled in the main Electron process, and thus shouldn't be override-able as a result of an Electron XSS to the best of my knowledge (but someone please correct me if otherwise).

In the process of making Mousetrap work, I also submitted another PR to iohook, but now it looks like that functionality won't be needed after all. Oh well, hope it's useful to someone :)

anoadragon453 commented 5 years ago

So the combination of KeyboardEvent and iohook was exactly what was needed, and we can now represent pretty much all keyboard combinations now (mouse to likely following in a separate PR :)

Still some bugs to work out, as well as getting this working on Scalar, but as a treat here's a video of it in action! The Missing Translation is due to, well, a translation for this new text not existing yet. And yes, visuals to hopefully be improved before the final version :)

https://streamable.com/ekqf5

BloodyIron commented 5 years ago

lol mrtestmanman nice! :D @anoadragon453

Thanks for your help!

anoadragon453 commented 5 years ago

image

So after a short weekend break, Push to Talk is now completely working! I've done the following today:

This is all instead of doing so only when the PTT option is enabled, or when Riot starts, which are both unnecessary. Of course, the option will still enable/disable mid-call

There is justified concern of the shortcut recording feature, which records all keypresses when setting a shortcut to see which keys you're pressing. This feature is now always disabled when Riot is not in focus, even in the event of an XSS.

This one was important, obviously. Riot no longer crashes when you try to open a window, such as when sending a file or clicking on the taskbar icon :)

The final step is just to get the setup working with Scalar, which should happen shortly after I get a working Scalar setup going locally (about 80% of the way there, but need a few more pieces). After that will be the PR phase :)

BloodyIron commented 5 years ago

@anoadragon453

How does this impact those that want the option to have the mic wide open too? Is that just the behaviour when PTT is disabled in settings?

anoadragon453 commented 5 years ago

If PTT is disabled, it will be the same functionality is it is now.

anoadragon453 commented 5 years ago

Alright! Most of the day was spent on getting Scalar up and running, but now it finally is, hooray!

So will just need to hack on Scalar a little bit tomorrow and everything should be good to go!

anoadragon453 commented 5 years ago

Ok, got Scalar working really nicely, and PRs for everything are now live!

In addition, I managed to make the following work while polishing everything up:

Mouse buttons might work (testing wanted!), but I'm envisioning we'll need to do some UI work for that (little picture of a mouse, maybe?).

Look forward to seeing these land on the /develop branch!

BloodyIron commented 5 years ago

Can we add to the future to-do list the Riot icon have a mute symbol over it when PTT is muted or something? Although, not sure how that would work for multiple room jitsi chat... uhhhh?

Also, when we get the audible cue for PTT mute/unmute sound, can we make that two sounds, and user-customizable and/or pre-defined in the Riot config.json? (for those of us who built it ourselves :D )

Also, thanks a tonne for this! Yay! Can't wait to use it! :DDD

anoadragon453 commented 5 years ago

Can we add to the future to-do list the Riot icon have a mute symbol over it when PTT is muted or something?

Sure

Also, when we get the audible cue for PTT mute/unmute sound, can we make that two sounds, and user-customizable and/or pre-defined in the Riot config.json? (for those of us who built it ourselves :D )

That might be part of a big redactor which allows to customize all the different sounds of Riot, but I'll see how difficult it is to just add custom sounds for PTT for now.

Also, thanks a tonne for this! Yay!

yw!

anoadragon453 commented 5 years ago

Mainlining has been momentarily stuck behind the concern that our build machines currently run MacOS (which can compile electron builds for Linux/MacOS/Windows), but compiling native node modules (of which iohook is) requires actually building on Linx/MacOS/Windows boxes. We can obviously accomplish this via VMs, but the infrastructure needs to be modified first.

In the meantime, as iohook provides pre-compiled binaries, the decision is to just use these until we have our own systems set up. The binaries are downloaded automatically during the typically npm install phase, so the actual build process will not change at all, unless you want to compile iohook on your own (instructions for which should be provided soon).

Other than that, the issues with the Settings UI have been fixed, and here is the current implementation :)

image

BloodyIron commented 5 years ago

@anoadragon453 any particular advantage to having iohook be part of the build flow vs binaries?

jryans commented 5 years ago

any particular advantage to having iohook be part of the build flow vs binaries?

I would assume the main advantages are around trust and future maintenance. When you rely on pre-compiled dependencies, you are trusting the dependency author to build them without anything malicious added in. On the maintenance side, they may also stop providing them for new Node / Electron versions. So in the long run, it's best to build from source, especially for security focused software like Riot.

anoadragon453 commented 5 years ago

Pretty much this.

BloodyIron commented 5 years ago

right! any chance iohook will get baked into the main repo here then? (for those of us looking to do our own builds of riot? ;P )