jerryschen31 / learn

Scratch repo used for learning stuff
GNU General Public License v3.0

2023-01-24 WebRTC #7

Open jerryschen31 opened 1 year ago

jerryschen31 commented 1 year ago

WebRTC signalling server https://youtu.be/SsN4gl_wV_8

What is a STUN and TURN server https://youtu.be/4dLJmZOcWFc

jerryschen31 commented 1 year ago

ICE framework - pass the ICE server URLs (STUN and/or TURN) in the configuration when creating an RTCPeerConnection https://www.youtube.com/watch?v=_4FkRf9utSc

jerryschen31 commented 1 year ago

adapter.js - allows WebRTC to work across browsers

routing - SFU (Selective Forwarding Unit, a central routing server)

https://youtu.be/AjLFvHuG0cE
ICE candidates - a client's available IP addresses and ports:
1) local IP address - only useful inside the LAN
2) public IP address - discovered by asking a STUN server, which reports the address and port it sees after the device's NAT
3) TURN server address (relay server)

Candidates learned from a STUN server are known as srflx (server reflexive). prflx (peer reflexive) candidates are similar, but are discovered during connectivity checks with the other peer rather than from the STUN server.
Firewalls often don't allow direct peer-to-peer connections -> need to use a 3rd-party relay server (TURN server) for passing data between clients.

Lab https://youtu.be/_FlzKsEVRK4

jerryschen31 commented 1 year ago

[TODO] Learn Express ...

const express = require('express');
const app = express();
let http = require('http').Server(app);

const port = process.env.PORT || 3000;

// static hosting of public folder
app.use(express.static('public'));

http.listen(port, () => {
  console.log('listening on ', port);
});

jerryschen31 commented 1 year ago

[TODO] Research STUN (ICE) servers - Mozilla, Google, ...? Free?

const iceServers = {
  'iceServers': [
    {'urls': 'stun:stun.services.mozilla.com'},
    {'urls': 'stun:stun.l.google.com:19302'}
  ]
};
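A minimal sketch of how such a configuration might be assembled and handed to RTCPeerConnection. The helper name buildRtcConfig is hypothetical; `iceServers` (plural), `urls`, `username` and `credential` are the standard RTCConfiguration fields. The constructor call is guarded because RTCPeerConnection only exists in browsers.

```javascript
// Build an RTCConfiguration object from lists of STUN and TURN servers.
// buildRtcConfig is a hypothetical helper, not part of any library.
function buildRtcConfig(stunUrls, turnServers = []) {
  const iceServers = stunUrls.map((url) => ({ urls: url }));
  for (const turn of turnServers) {
    // TURN servers normally require credentials; STUN servers do not.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

const config = buildRtcConfig([
  "stun:stun.services.mozilla.com",
  "stun:stun.l.google.com:19302",
]);

// RTCPeerConnection is a browser API, so guard the call to keep
// the sketch runnable elsewhere (pc is null outside a browser).
const pc =
  typeof RTCPeerConnection !== "undefined"
    ? new RTCPeerConnection(config)
    : null;
```

Whether the public STUN servers listed here are still reachable and free to use is exactly what the TODO above is meant to check.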

jerryschen31 commented 1 year ago

[TODO] learn socket.io - understand socket.join
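A rough sketch of what socket.join is for in a signaling server: it puts a socket into a named room so that messages can be relayed only to the peers in that room. The event names "join", "peer-joined" and "signal" and the function name registerSignaling are made up for this sketch; `socket.join(room)` and `socket.to(room).emit(...)` are the socket.io calls being illustrated.

```javascript
// Register signaling handlers on a socket.io server instance (`io`).
// Event names "join", "peer-joined" and "signal" are hypothetical.
function registerSignaling(io) {
  io.on("connection", (socket) => {
    socket.on("join", (room) => {
      // socket.join adds this socket to the named room (created on demand).
      socket.join(room);
      // socket.to(room) targets everyone in the room except the sender.
      socket.to(room).emit("peer-joined", socket.id);
    });

    socket.on("signal", ({ room, data }) => {
      // Relay offers, answers and ICE candidates to the other peer(s).
      socket.to(room).emit("signal", { from: socket.id, data });
    });
  });
}
```

With socket.io v4 this would typically be attached to the Express-backed HTTP server with something like `const { Server } = require('socket.io'); registerSignaling(new Server(http));`, assuming `http` is the server object created in the Express snippet.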

jerryschen31 commented 1 year ago

Lab https://youtu.be/_FlzKsEVRK4 https://youtu.be/zpLQjr2FsRg https://youtu.be/rSdA0xgGl38 https://youtu.be/5m6sNAtNL-8

jerryschen31 commented 1 year ago

[TODO] review basic HTML DOM manipulation and basic jQuery
https://www.w3schools.com/jsref/met_document_createelement.asp
https://www.w3schools.com/jsref/prop_document_documenturi.asp

jerryschen31 commented 1 year ago

https://www.w3schools.com/jsref/obj_window.asp

jerryschen31 commented 1 year ago

Go through basic samples from WebRTC org https://webrtc.github.io/samples/ https://github.com/webrtc/samples

jerryschen31 commented 1 year ago

Some more examples https://github.com/muaz-khan/WebRTC-Experiment

jerryschen31 commented 1 year ago

Understand addTrack() https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/addTrack

jerryschen31 commented 1 year ago

[TODO] I need to understand socket.io and rtcPeerConnection better.

jerryschen31 commented 1 year ago

addTrack https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/addTrack

jerryschen31 commented 1 year ago

signaling and video calling - MDN WebRTC documentation https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Signaling_and_video_calling#starting_negotiation

jerryschen31 commented 1 year ago

https://developer.mozilla.org/en-US/docs/Glossary/ICE
ICE (Interactive Connectivity Establishment) is a framework used by WebRTC (among other technologies) for connecting two peers, regardless of network topology (usually for audio and video chat). This protocol lets two peers find and establish a connection with one another even though they may both be using a Network Address Translator (NAT) to share a global IP address with other devices on their respective local networks.

The framework algorithm looks for the lowest-latency path for connecting the two peers, trying these options in order:

  1. Direct UDP connection (in this case, and only this case, a STUN server is used to find the network-facing address of a peer)
  2. Direct TCP connection, via the HTTP port
  3. Direct TCP connection, via the HTTPS port
  4. Indirect connection via a relay/TURN server (if a direct connection fails, e.g., if one peer is behind a firewall that blocks NAT traversal)

See also: WebRTC, the principal web-related protocol which uses ICE; WebRTC protocols; RFC 8445, the IETF specification for ICE; RTCIceCandidate, the interface representing an ICE candidate

jerryschen31 commented 1 year ago

This is a great overview https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Protocols

jerryschen31 commented 1 year ago

Interactive Connectivity Establishment (ICE) is a framework to allow your web browser to connect with peers... needs to bypass firewalls that would prevent opening connections, give you a unique address if like most situations your device doesn't have a public IP address, and relay data through a server if your router doesn't allow you to directly connect with peers. ICE uses STUN and/or TURN servers to accomplish this, as described below.

Session Traversal Utilities for NAT (STUN) is a protocol to discover your public address and determine any restrictions in your router that would prevent a direct connection with a peer.

Network Address Translation (NAT) is used to give your device a public IP address. A router will have a public IP address and every device connected to the router will have a private IP address. Requests will be translated from the device's private IP to the router's public IP with a unique port. That way you don't need a unique public IP for each device but can still be discovered on the Internet. [Q: BUT WHAT ABOUT TRAFFIC FROM OUTSIDE BACK TO THE PRIVATE DEVICE. HOW DOES ROUTER KNOW WHICH PRIVATE DEVICE TO ROUTE BACK TO?]
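On the question above: the usual answer is that the router keeps a translation table. When a private device sends a packet out, the router picks a public port, records "public port -> private IP and port", and uses that record to route replies back. The toy model below sketches that idea; the class name NatRouter and all addresses are illustrative, and real routers also track the protocol, the remote endpoint and timeouts.

```javascript
// Toy model of a NAT router's translation table (port address translation).
// Hypothetical and simplified: no protocols, timeouts or remote endpoints.
class NatRouter {
  constructor(publicIp, firstPort = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.byPublicPort = new Map(); // public port -> private endpoint
    this.byPrivate = new Map(); // "ip:port" -> public port
  }

  // Outbound packet: rewrite the private source to publicIp:publicPort
  // and remember the mapping so replies can be routed back.
  outbound(privateIp, privatePort) {
    const key = `${privateIp}:${privatePort}`;
    if (!this.byPrivate.has(key)) {
      const publicPort = this.nextPort++;
      this.byPrivate.set(key, publicPort);
      this.byPublicPort.set(publicPort, { ip: privateIp, port: privatePort });
    }
    return { ip: this.publicIp, port: this.byPrivate.get(key) };
  }

  // Inbound packet: look up which private device owns the public port.
  // No mapping means the router has nowhere to forward it (returns null).
  inbound(publicPort) {
    return this.byPublicPort.get(publicPort) ?? null;
  }
}

const router = new NatRouter("203.0.113.7");
const laptop = router.outbound("192.168.1.102", 1816);
const phone = router.outbound("192.168.1.103", 1816);
console.log(laptop, phone); // same public IP, different public ports
console.log(router.inbound(laptop.port)); // the laptop's private endpoint
console.log(router.inbound(50000)); // null: unsolicited traffic has no mapping
```

This is also why unsolicited inbound connections fail: without an earlier outbound packet there is no table entry to consult, which is the gap STUN and TURN work around.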

Understanding NAT and Firewalls https://www.onsip.com/voip-resources/voip-fundamentals/what-are-nat-and-firewall-traversals#:~:text=A%20router%20uses%20NAT%20to,to%20reach%20the%20proper%20destination.

Firewalls act more as gatekeepers, whereas NAT acts more like a translator. Both technologies are intended to add extra security to your local network. By maintaining private IP addresses for each of your devices and inspecting all incoming and outgoing packets, these technologies make it difficult for outside parties to illegally hack into or access your network.

Some routers will have restrictions on who can connect to devices on the network. This can mean that even though we have the public IP address found by the STUN server, not anyone can create a connection. In this situation we need to use TURN. Some routers using NAT employ a restriction called 'Symmetric NAT'. This means the router will only accept connections from peers you've previously connected to.

Traversal Using Relays around NAT (TURN) is meant to bypass the Symmetric NAT restriction by opening a connection with a TURN server and relaying all information through that server. You would create a connection with a TURN server and tell all peers to send packets to the server which will then be forwarded to you. This obviously comes with some overhead so it is only used if there are no other alternatives.

Session Description Protocol (SDP) is a standard for describing the multimedia content of the connection such as resolution, formats, codecs, encryption, etc. so that both peers can understand each other once the data is transferring. This is, in essence, the metadata describing the content and not the media content itself. Specification: RFC 4566: SDP: Session Description Protocol
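To make "metadata describing the content" concrete, here is a small sketch that pulls the media sections and codecs out of an SDP blob. The sample SDP is hand-written and heavily trimmed for illustration (a real browser offer is much longer), and listMediaSections is a hypothetical helper, not a WebRTC API.

```javascript
// Hand-written, heavily trimmed SDP for illustration only.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

// Collect each media section ("m=" line) together with the codecs
// declared by the "a=rtpmap" attributes that follow it.
function listMediaSections(sdp) {
  const sections = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // Format: m=<media> <port> <proto> <payload types...>
      const [kind, port, protocol, ...payloadTypes] = line.slice(2).split(" ");
      sections.push({ kind, port: Number(port), protocol, payloadTypes, codecs: [] });
    } else if (line.startsWith("a=rtpmap:") && sections.length > 0) {
      // Format: a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
      const codec = line.slice("a=rtpmap:".length).split(" ")[1];
      sections[sections.length - 1].codecs.push(codec);
    }
  }
  return sections;
}

console.log(listMediaSections(sampleSdp));
```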

jerryschen31 commented 1 year ago

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Connectivity
Unfortunately, WebRTC can't create connections without some sort of server in the middle. We call this the signal channel or signaling service. It's any sort of channel of communication to exchange information before setting up a connection, whether by email, postcard, or a carrier pigeon. It's up to you.

The information we need to exchange is the Offer and Answer which just contains the SDP mentioned below. Peer A who will be the initiator of the connection, will create an Offer. They will then send this offer to Peer B using the chosen signal channel. Peer B will receive the Offer from the signal channel and create an Answer. They will then send this back to Peer A along the signal channel.

The configuration of an endpoint on a WebRTC connection is called a session description. The description includes information about the kind of media being sent (audio, video, data, ...), its format, the transfer protocol being used, the endpoint's IP address and port, and other information needed to describe a media transfer endpoint. This information is exchanged and stored using Session Description Protocol (SDP); if you want details on the format of SDP data, you can find it in RFC 2327 (since obsoleted by RFC 4566, and later by RFC 8866).

When a user starts a WebRTC call to another user, a special description is created called an offer. This description includes all the information about the caller's proposed configuration for the call. The recipient then responds with an answer, which is a description of their end of the call. In this way, both devices share with one another the information needed in order to exchange media data. This exchange is handled using Interactive Connectivity Establishment (ICE), a protocol which lets two devices use an intermediary to exchange offers and answers even if the two devices are separated by Network Address Translation (NAT).

Each peer, then, keeps two descriptions on hand: the local description, describing itself, and the remote description, describing the other end of the call.

Steps

  1. The caller captures local Media via MediaDevices.getUserMedia
  2. The caller creates an RTCPeerConnection and calls RTCPeerConnection.addTrack() for each track (addStream() is deprecated)
  3. The caller calls RTCPeerConnection.createOffer() to create an offer.
  4. The caller calls RTCPeerConnection.setLocalDescription() to set that offer as the local description (that is, the description of the local end of the connection).
  5. After setLocalDescription(), the caller's ICE agent starts gathering ICE candidates, querying the configured STUN/TURN servers as needed
  6. The caller uses the signaling server to transmit the offer to the intended receiver of the call.
  7. The recipient receives the offer and calls RTCPeerConnection.setRemoteDescription() to record it as the remote description (the description of the other end of the connection).
  8. The recipient does any setup it needs to do for its end of the call: capture its local media, and attach each media track to the peer connection via RTCPeerConnection.addTrack()
  9. The recipient then creates an answer by calling RTCPeerConnection.createAnswer().
  10. The recipient calls RTCPeerConnection.setLocalDescription(), passing in the created answer, to set the answer as its local description. The recipient now knows the configuration of both ends of the connection.
  11. The recipient uses the signaling server to send the answer to the caller.
  12. The caller receives the answer.
  13. The caller calls RTCPeerConnection.setRemoteDescription() to set the answer as the remote description for its end of the call. It now knows the configuration of both peers.
  14. Media begins to flow as configured.
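The steps above can be sketched as three small functions. This is only an outline under assumptions: `pc` stands for an RTCPeerConnection, `stream` for the MediaStream from getUserMedia, and `signaling` is a hypothetical object with a send() method standing in for whatever signaling channel is used. The "video-offer" and "video-answer" message types follow the MDN sample quoted in these notes. ICE candidate gathering (step 5) is started by the browser itself after setLocalDescription(), so it does not appear as a call here.

```javascript
// Caller side, steps 2-6. `signaling.send` is a hypothetical stand-in
// for the signaling channel.
async function startCall(pc, stream, signaling) {
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send({ type: "video-offer", sdp: pc.localDescription });
}

// Recipient side, steps 7-11.
async function answerCall(pc, stream, offerMsg, signaling) {
  await pc.setRemoteDescription(offerMsg.sdp);
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  signaling.send({ type: "video-answer", sdp: pc.localDescription });
}

// Caller side again, steps 12-13.
async function finishCall(pc, answerMsg) {
  await pc.setRemoteDescription(answerMsg.sdp);
}
```

Passing `pc` and `signaling` in as parameters is a design choice for the sketch: it keeps the ordering of the calls visible and lets the flow be exercised with stand-in objects outside a browser.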

https://stackoverflow.com/questions/21069983/what-are-ice-candidates-and-how-do-the-peer-connection-choose-between-them
Typically an ICE candidate provides the IP address and port from which the data is going to be exchanged.

Its format is something like the following:

a=candidate:1 1 UDP 2130706431 192.168.1.102 1816 typ host

Here UDP specifies the protocol to be used, and typ host specifies the type of ICE candidate; host means the candidate was generated within the firewall (a local address). If you use Wireshark to monitor the traffic, you can see that the ports used for data transfer are the same as the ones present in the ICE candidates.

Another type is relay, which denotes a candidate that can be used when communication has to go outside the firewall.
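As a rough illustration of the format, the sketch below splits the example candidate line into named fields and decodes its priority. The function names are hypothetical. The priority layout (2^24 * type preference + 2^8 * local preference + 256 - component ID) is the formula from RFC 8445; for the example line it decodes to the values that RFC recommends for a host candidate.

```javascript
// Parse the fixed leading fields of an ICE candidate attribute line.
// parseCandidate is a hypothetical helper, not a WebRTC API.
function parseCandidate(line) {
  const fields = line.replace(/^a=/, "").replace(/^candidate:/, "").split(" ");
  const [foundation, component, transport, priority, address, port, typKeyword, type] = fields;
  if (typKeyword !== "typ") {
    throw new Error("not a candidate line: " + line);
  }
  return {
    foundation,
    component: Number(component),
    transport: transport.toUpperCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // "host", "srflx", "prflx" or "relay"
  };
}

// Split a priority into the three parts RFC 8445 combines:
// 2^24 * typePreference + 2^8 * localPreference + (256 - componentId).
function decodePriority(priority) {
  return {
    typePreference: Math.floor(priority / 2 ** 24),
    localPreference: Math.floor(priority / 2 ** 8) % 2 ** 16,
    componentId: 256 - (priority % 2 ** 8),
  };
}

const candidate = parseCandidate(
  "a=candidate:1 1 UDP 2130706431 192.168.1.102 1816 typ host"
);
console.log(candidate);
// → typePreference 126, localPreference 65535, componentId 1
console.log(decodePriority(candidate.priority));
```

A higher type preference means the candidate is tried earlier, which is how direct (host) paths end up preferred over relayed ones.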

jerryschen31 commented 1 year ago

selenium for testing web applications https://www.selenium.dev/

jerryschen31 commented 1 year ago

mocha for testing JS applications https://mochajs.org/

jerryschen31 commented 1 year ago

[TODO] need to understand sockets better

jerryschen31 commented 1 year ago

RTCPeerConnection https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Signaling_and_video_calling#starting_negotiation

Once the caller has created its RTCPeerConnection, created a media stream, and added its tracks to the connection as shown in Starting a call, the browser will deliver a negotiationneeded event to the RTCPeerConnection to indicate that it's ready to begin negotiation with the other peer. Here's our code for handling the negotiationneeded event:

function handleNegotiationNeededEvent() {
  myPeerConnection
    .createOffer()
    .then((offer) => myPeerConnection.setLocalDescription(offer))
    .then(() => {
      sendToServer({
        name: myUsername,
        target: targetUsername,
        type: "video-offer",
        sdp: myPeerConnection.localDescription,
      });
    })
    .catch(reportError);
}

jerryschen31 commented 1 year ago

https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/negotiationneeded_event
A negotiationneeded event is sent to the RTCPeerConnection when negotiation of the connection through the signaling channel is required. This occurs both during the initial setup of the connection as well as any time a change to the communication environment requires reconfiguring the connection.

The negotiationneeded event is first dispatched to the RTCPeerConnection when media is first added to the connection. This starts the process of ICE negotiation by instructing your code to begin exchanging ICE candidates through the signaling server. See Signaling transaction flow for a description of the signaling process that begins with a negotiationneeded event.

pc.onnegotiationneeded = (ev) => {
  pc.createOffer()
  .then((offer) => pc.setLocalDescription(offer))
  .then(() => sendSignalingMessage({
    type: "video-offer",
    sdp: pc.localDescription
  }))
  .catch((err) => {
    /* handle error */
  });
};

When createOffer() succeeds (fulfilling the promise), we pass the created offer information into myPeerConnection.setLocalDescription(), which configures the connection and media configuration state for the caller's end of the connection.

After creating the offer, the local end is configured by calling RTCPeerConnection.setLocalDescription(); then a signaling message is created and sent to the remote peer through the signaling server, to share that offer with the other peer. The other peer should recognize this message and follow up by creating its own RTCPeerConnection, setting the remote description with setRemoteDescription(), and then creating an answer to send back to the offering peer.

Once setLocalDescription()'s fulfillment handler has run, the ICE agent begins sending icecandidate events to the RTCPeerConnection, one for each potential configuration it discovers. Our handler for the icecandidate event is responsible for transmitting the candidates to the other peer. (Candidates = this client's local addresses, its public address as discovered via a STUN server, or a relay address on a TURN server)

The ICE negotiation process involves each peer sending candidates to the other, repeatedly, until it runs out of potential ways it can support the RTCPeerConnection's media transport needs. Since ICE doesn't know about your signaling server, your code handles transmission of each candidate in your handler for the icecandidate event.
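A minimal sketch of that candidate exchange, under the same kind of assumptions as the MDN sample: `signaling.send` is a hypothetical stand-in for the signaling channel, and "new-ice-candidate" is the message type used in that sample. The `icecandidate` event and `addIceCandidate()` are the standard RTCPeerConnection pieces involved.

```javascript
// Forward each locally gathered candidate to the other peer.
// `signaling.send` is a hypothetical stand-in for your signaling channel.
function forwardIceCandidates(pc, signaling) {
  pc.onicecandidate = (event) => {
    // A null candidate marks the end of gathering; nothing to send.
    if (event.candidate) {
      signaling.send({ type: "new-ice-candidate", candidate: event.candidate });
    }
  };
}

// On the receiving side, hand each incoming candidate to the ICE agent.
async function handleNewIceCandidateMsg(pc, msg) {
  await pc.addIceCandidate(msg.candidate);
}
```

Both peers run both halves: each forwards its own candidates and adds the ones it receives, until neither side has anything new to offer.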

Session negotiation

Now that we've started negotiation with the other peer and have transmitted an offer, let's look at what happens on the callee's side of the connection for a while. The callee receives the offer and calls handleVideoOfferMsg() function to process it. Let's see how the callee handles the "video-offer" message.

When the offer arrives, the callee's handleVideoOfferMsg() function is called with the "video-offer" message that was received. This function needs to do two things. First, it needs to create its own RTCPeerConnection and add the tracks containing the audio and video from its microphone and webcam to that. Second, it needs to process the received offer, constructing and sending its answer.

The createPeerConnection() function is used by both the caller and the callee to construct their RTCPeerConnection objects, their respective ends of the WebRTC connection. It's invoked by invite() when the caller tries to start a call, and by handleVideoOfferMsg() when the callee receives an offer message from the caller.