conversejs / converse.js

Web-based XMPP/Jabber chat client written in JavaScript
http://conversejs.org
Mozilla Public License 2.0

Audio/video calls integration #447

Open nodiscc opened 8 years ago

nodiscc commented 8 years ago

Hello, I just found an old draft for an Audio/Video support request I was planning to submit here. @jcbrand is this something you'd like to have in converse.js? I did some quick research on possible implementations and am pasting the contents below.


A quick search for FOSS a/v web chat solutions returned https://www.jsxc.org/ (XMPP + voice/video) and https://github.com/Rantanen/mumble-web (mumble). I didn't find the sources for Firefox Hello (client and server), which works.

The go-to solution for audio/video over XMPP seems to be Jingle. This is closely related to https://github.com/jcbrand/converse.js/issues/161

Relevant links I found:

XEP-0166: Jingle - This specification defines an XMPP protocol extension for initiating and managing peer-to-peer media sessions between two XMPP entities in a way that is interoperable with existing Internet standards. The protocol provides a pluggable model that enables the core session management semantics (compatible with SIP) to be used for a wide variety of application types (e.g., voice chat, video chat, file transfer) and with a wide variety of transport methods (e.g., TCP, UDP, ICE, application-specific transports).

XEP-0266: Codecs for Jingle Audio - This document describes implementation considerations related to audio codecs for use in Jingle RTP sessions.

XMPP Technologies: Jingle – The XMPP Standards Foundation

XEP-0167: Jingle RTP Sessions - This specification defines a Jingle application type for negotiating one or more sessions that use the Real-time Transport Protocol (RTP) to exchange media such as voice or video. The application type includes a straightforward mapping to Session Description Protocol (SDP) for interworking with SIP media endpoints.

Combining SIP and XMPP | OnSIP -> sixpac Discussion Archive - Date Index

OTR does not seem to be able to encrypt audio/video streams, but there's ZRTP: XEP-0262: Use of ZRTP in Jingle RTP Sessions. Also:

[cryptography] Jingle and Otr - If you know the face of the person you are talking to, you can surely tell if the right person is speaking the right SAS, which makes the methods used by OTR overkill for video.


jcbrand commented 8 years ago

Thanks for doing some research into this.

I think the approach I would take is to use Strophe.jingle (or something similar) with WebRTC.

Would be a nice feature to have. Ideally a company that needs this feature would fund the development for it.

bes1002t commented 8 years ago

I found this libjingle adaptation for JavaScript: https://github.com/otalk/jingle.js. libjingle is a library that makes audio and video chat over XMPP possible, and clients like Pidgin already use it. So maybe this could help you implement that feature.

sebastienhasa commented 4 years ago

Any news?

jcbrand commented 4 years ago

There are no plans to add audio/video support, but pull requests are welcome if someone wants to take up the challenge.

bes1002t commented 4 years ago

Maybe WebRTC would now be a good technology to use. But yes, it would take some effort to implement.

free-14 commented 3 years ago

If someone can do this and embed the code for audio/video calls as in Pix-Art Messenger, I am ready to sponsor this project.

licaon-kter commented 3 years ago

@free-14 you mean Conversations, since Pix-Art is just a fork, right? Also... that's Java on Android... This is JavaScript.

deleolajide commented 3 years ago

I have a working conversejs plugin for Jingle audio/video calls that works with the Jingle implementation in Conversations. However, it has outstanding issues and needs a makeover to remove the outdated jQuery XML parsing. I am not free to work on it at the moment, but for anyone who has the time, space and ability, here is the link:

https://github.com/igniterealtime/pade/blob/version-1.xx/docs/inverse/plugins/jinglecalls.js

free-14 commented 3 years ago

> @free-14 you mean Conversations, since Pix-Art is just a fork, right? Also... that's Java on Android... This is Javascript.

Yes, you noticed correctly. It's just that I'm already used to the design and additional features of pix-art. )

free-14 commented 3 years ago

There is another ready-made solution that works well in Nextcloud: JSXC. It is open source. I think this solution is not bad when combined.

jcbrand commented 2 years ago

Implementation of this feature is now part of our 2022 Google Summer of Code application. See here: https://wiki.xmpp.org/web/Google_Summer_of_Code_2022#Support_Audio/Video_calls_in_Converse

I will be acting as mentor for this project.

I've done a bit of research and want to write down my thoughts and findings as a resource for myself and whoever will be doing the implementation.

Note: For now I'm just going to do a brain dump, and I'll likely update this comment multiple times over an extended period.

Introduction

Audio/Video calls are done outside of XMPP via WebRTC. XMPP (in the form of the Jingle protocol extension) is merely used as a means to initiate and terminate an audio/video session. In plain language, this means that Jingle is used to call the other person (AFAIK that's why it's called "Jingle", like ringtone), and that person can then pick up, or reject the call. If they pick up the call, either person has the option to end the call at any time.

Making the call and accepting, rejecting or ending a call are all session management operations that are done by sending XMPP stanzas between two clients.
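To make the call lifecycle above concrete, here is a minimal sketch of it as a small state machine. The state and event names are hypothetical and chosen for illustration only; they are not taken from Converse or from the XEPs.

```javascript
// Hypothetical call states and events (illustrative names, not from any XEP).
const TRANSITIONS = {
    idle:    { propose: 'ringing' },                      // someone places a call
    ringing: { accept: 'active', reject: 'ended', retract: 'ended' },
    active:  { terminate: 'ended' },                      // either party hangs up
    ended:   {},
};

// Returns the next state, or throws if the event is not allowed in this state.
function nextCallState(state, event) {
    const next = TRANSITIONS[state]?.[event];
    if (!next) {
        throw new Error(`Invalid transition: "${event}" while in state "${state}"`);
    }
    return next;
}

// Replays a list of events, starting from the idle state.
function runCall(events) {
    return events.reduce((state, event) => nextCallState(state, event), 'idle');
}
```

For example, runCall(['propose', 'accept', 'terminate']) walks through a call that is placed, picked up and then ended, while runCall(['accept']) throws because nobody has called yet. In a real plugin, each transition would be driven by an incoming or outgoing XMPP stanza.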

UI components

So, from a UX perspective, we need the following UI elements:

Specifications

The main XEPs related to this feature are:

How to start:

We need to create a new plugin called jingle (eventually it could be split up into two plugins, one without UI elements for @converse/headless and another for the UI).

Then I would start by writing tests, and I would start with XEP-0353: Jingle Message Initiation. I would write a test in which the user receives an incoming Jingle (XEP-0353) message and which then checks that the right things happen (e.g. that the user is informed of the incoming call, and if they accept the call, that the right stanzas are sent out).

Multiple tests could be written here, to go through the various scenarios explained in the XEPs.

I would first write all the tests for the Jingle stanza traffic (and the associated UI elements) before worrying about WebRTC and actually setting up the audio/video calls.
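As a rough illustration of the kind of stanza traffic such tests would exercise, here is a sketch that builds and recognises XEP-0353 messages as plain strings. The helper names (buildPropose, buildReply, parseJingleAction) are hypothetical, and the regular-expression parsing is a simplification to keep the example self-contained; a real plugin would use the stanza builder and DOM helpers that Converse already relies on. It also ignores the XEP's rules about which JID each response is addressed to.

```javascript
const JMI_NS = 'urn:xmpp:jingle-message:0';   // namespace from XEP-0353
const RTP_NS = 'urn:xmpp:jingle:apps:rtp:1';  // namespace from XEP-0167
const JMI_ACTIONS = ['propose', 'retract', 'accept', 'proceed', 'reject'];

// Builds a <propose/> message offering a call with the given media types.
function buildPropose(to, sid, media = ['audio']) {
    const descriptions = media
        .map(m => `<description xmlns="${RTP_NS}" media="${m}"/>`)
        .join('');
    return `<message to="${to}"><propose xmlns="${JMI_NS}" id="${sid}">${descriptions}</propose></message>`;
}

// Builds the callee's response to a proposal.
function buildReply(to, sid, action) {
    if (!['accept', 'proceed', 'reject'].includes(action)) {
        throw new Error(`Unsupported reply action: ${action}`);
    }
    return `<message to="${to}"><${action} xmlns="${JMI_NS}" id="${sid}"/></message>`;
}

// Simplified detection of the Jingle Message Initiation element in a stanza.
// Returns { action, sid } or null when the stanza is not a JMI message.
function parseJingleAction(xml) {
    const pattern = new RegExp(`<(${JMI_ACTIONS.join('|')})\\s[^>]*xmlns="${JMI_NS}"[^>]*>`);
    const match = xml.match(pattern);
    if (!match) return null;
    const id = match[0].match(/\sid="([^"]*)"/);
    return { action: match[1], sid: id ? id[1] : null };
}
```

A test could then feed buildPropose('juliet@capulet.example', 'abc123') into the code under test as the incoming stanza, simulate the user accepting, and assert that parseJingleAction on the outgoing stanza yields the expected action with the same session id.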

Once the tests (and resultant implementations) are done, we can look at actually setting up the WebRTC session in a popup.

Other implementations that can be used as reference

Check the license before just blindly copying code. MIT, BSD and MPL are ok, but GPL isn't. When reusing big chunks of code, make sure to include copyright and licensing information.

poVoq commented 2 years ago

JSXC has an example implementation that also works for group calls and with an SFU. They wrote down an overview here: https://www.jsxc.org/blog/2021/08/31/A-group-call-proposal.html

jcbrand commented 2 years ago

@poVoq thanks, group calls are outside of the scope of this issue and the GSoC project. We can make a separate feature request for it.

jcbrand commented 1 year ago

@PawBud

As discussed in our last video call, here is the information you need regarding the next step of showing information about calls in the chat history.

The chat history is rendered via the component converse-message-history that's defined here: https://github.com/conversejs/converse.js/blob/0cfe2a18afb334066abc2935020cc3d026334bd9/src/shared/chat/message-history.js

On line 34, you'll see that it triggers a hook (via api.hook). The name of the hook is stored in a variable called template_hook, which comes from the message model (this.model.get('template_hook')). The template returned by the hook is then rendered by the renderMessage function.

See here: https://github.com/conversejs/converse.js/blob/0cfe2a18afb334066abc2935020cc3d026334bd9/src/shared/chat/message-history.js#L34

This code allows plugins to control which template is rendered for a particular message. To do so, you have to make sure that there is an attribute template_hook stored on the message object. The value of that attribute is then the name of the hook that gets triggered. You can then register a handler for that hook, which returns a template that renders the message.

So, when you parse an incoming message to see if it's a Jingle message (in parseJingleMessage), you can add another attribute to the attrs object, namely template_hook: 'getJingleTemplate'.

That attribute will then be stored on the message object that gets created after parsing has been completed, and so when renderMessage for that message is called in the MessageHistory component, the hook getJingleTemplate will be triggered.
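The flow described above can be sketched roughly as follows. This is a self-contained stand-in, not Converse's actual code: the on and hook functions below only imitate the roles that api.listen.on and api.hook play, messages are plain objects rather than models, the is_jingle and jingle_status attributes are hypothetical, and templates are plain strings rather than lit templates.

```javascript
// Stand-in hook registry (assumption: handlers receive the context and the
// current value, and return the value to pass along).
const handlers = new Map();

function on(name, handler) {
    handlers.set(name, [...(handlers.get(name) ?? []), handler]);
}

async function hook(name, context, data) {
    for (const handler of handlers.get(name) ?? []) {
        data = await handler(context, data);
    }
    return data;
}

// Hypothetical parser step: tag Jingle messages with the name of the hook.
function parseJingleMessage(attrs) {
    return attrs.is_jingle ? { ...attrs, template_hook: 'getJingleTemplate' } : attrs;
}

// If the message names a hook, let the hook supply the template;
// otherwise fall back to a default template.
async function renderMessage(message) {
    if (typeof message.template_hook === 'string') {
        return hook(message.template_hook, message, '');
    }
    return `<div class="message">${message.body}</div>`;
}

// The plugin registers a handler that returns the Jingle-specific template.
on('getJingleTemplate', (message) =>
    `<div class="jingle-message">Call ${message.jingle_status}</div>`);
```

With this in place, a message parsed with is_jingle set is rendered by the getJingleTemplate handler, while ordinary messages still go through the default template.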

The above will handle incoming messages.

For outgoing messages, you set the template_hook attribute when you create the message object, here: https://github.com/conversejs/converse.js/blob/bbb985e008f3abf43c7aaa3a7ec7c501648edcf3/src/plugins/jingle/chat-header-notification.js#L46

You can then create a handler for that hook which returns a template to render the jingle message in the chat history. This template you need to create yourself. You can look at the general message template for inspiration: https://github.com/conversejs/converse.js/blob/0cfe2a18afb334066abc2935020cc3d026334bd9/src/shared/chat/templates/message.js

Your Jingle message template will have some similarities with regards to the markup it renders.

Once all of this is done, you should be able to see your Jingle messages rendered inside the chat history.