DoubangoTelecom / telepresence

the open source SIP TelePresence system
https://www.doubango.org/conf-call/
148 stars 97 forks source link

***ERROR: function: "tsip_transport_layer_ws_cb()" #22

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

1. Got an error "***ERROR: function: "tsip_transport_layer_ws_cb()" in console 
then try to connect..

What is the expected output? What do you see instead?

# ./telepresence
*******************************************************************
Copyright (C) 2013 Doubango Telecom <http://www.doubango.org>
PRODUCT: telepresence - the open source TelePresence System
HOME PAGE: http://conf-call.org
CODE SOURCE: https://code.google.com/p/telepresence/
LICENCE: GPLv3 or commercial(contact us)
VERSION: 2.1.0
'quit' to quit the application.
*******************************************************************

SSL is enabled :)
DTLS supported: yes
DTLS-SRTP supported: yes
*INFO: [TELEPRESENCE] [CFG] debug-audio-loopback = no
*INFO: [TELEPRESENCE] [CFG] accept-sip-reg = no
*INFO: [TELEPRESENCE] [CFG] transport = udp;*;20060;*
*INFO: [TELEPRESENCE] [CFG] transport = udp://*:20060@*
*INFO: [TELEPRESENCE] [CFG] transport = ws;*;20060;4
*INFO: [TELEPRESENCE] [CFG] transport = ws://*:20060@4
*INFO: [TELEPRESENCE] [CFG] transport = http;*;20065
*INFO: [TELEPRESENCE] [CFG] transport = https;*;20066
*INFO: [TELEPRESENCE] [CFG] rtp-symmetric-enabled = no
*INFO: [TELEPRESENCE] [CFG] ice-enabled = no
*INFO: [TELEPRESENCE] [CFG] icestun-enabled = no
*INFO: [TELEPRESENCE] [CFG] stun-server = 
stun.l.google.com;19302;stun-user@doubango.org;stun-password
*INFO: [TELEPRESENCE] [CFG] stun-server = stun.l.google.com;19302;-;-
*INFO: [TELEPRESENCE] [CFG] rtcp-mux-enabled = yes
*INFO: [TELEPRESENCE] [CFG] rtp-buffersize = 65535
*INFO: [TELEPRESENCE] [CFG] avpf-tail-length = 200;500
*INFO: [TELEPRESENCE] [CFG] codecs = g729;vp8
*INFO: 'g729' codec enabled but not supported
*INFO: 'vp8' codec enabled but not supported
*INFO: [TELEPRESENCE] [CFG] codec-opus-maxrates = 48000;48000
*INFO: [TELEPRESENCE] [CFG] congestion-ctrl-enabled = yes
*INFO: [TELEPRESENCE] [CFG] video-max-upload-bandwidth = 1890
*INFO: [TELEPRESENCE] [CFG] video-max-download-bandwidth = 1890
*INFO: [TELEPRESENCE] [CFG] video-motion-rank = 2
*INFO: [TELEPRESENCE] [CFG] video-fps = 15
*INFO: [TELEPRESENCE] [CFG] video-jb-enabled = yes
*INFO: [TELEPRESENCE] [CFG] video-zeroartifacts-enabled = yes
*INFO: [TELEPRESENCE] [CFG] video-mixed-size = vga
*INFO: [TELEPRESENCE] [CFG] video-speaker-par = 0:0
*INFO: [TELEPRESENCE] [CFG] video-listener-par = 1:1
*INFO: [TELEPRESENCE] [CFG] audio-channels = 1
*INFO: [TELEPRESENCE] [CFG] audio-bits-per-sample = 16
*INFO: [TELEPRESENCE] [CFG] audio-sample-rate = 8000
*INFO: [TELEPRESENCE] [CFG] audio-ptime = 20
*INFO: [TELEPRESENCE] [CFG] audio-volume = 1.0f
*INFO: [TELEPRESENCE] [CFG] audio-dim = 2d
*INFO: [TELEPRESENCE] [CFG] audio-max-latency = 200
*INFO: [TELEPRESENCE] [CFG] record = no
*INFO: [TELEPRESENCE] [CFG] record-file-ext = avi
*INFO: [TELEPRESENCE] [CFG] overlay-fonts-folder-path = 
./fonts/truetype/freefont
*INFO: [TELEPRESENCE] [CFG] overlay-copyright-text = Doubango Telecom
*INFO: [TELEPRESENCE] [CFG] overlay-copyright-fontsize = 12
*INFO: [TELEPRESENCE] [CFG] overlay-copyright-fontfile = FreeSerif.ttf
*INFO: [TELEPRESENCE] [CFG] overlay-speaker-name-enabled = yes
*INFO: [TELEPRESENCE] [CFG] overlay-speaker-name-fontsize = 16
*INFO: [TELEPRESENCE] [CFG] overlay-speaker-name-fontfile = FreeMonoBold.ttf
*INFO: [TELEPRESENCE] [CFG] overlay-speaker-jobtitle-enabled = yes
*INFO: [TELEPRESENCE] [CFG] overlay-watermark-image-path = 
./images/logo35x34.jpg
*INFO: [TELEPRESENCE] [CFG] ssl-private-key = /tmp/teleconf-xxx-key.pem
*INFO: [TELEPRESENCE] [CFG] ssl-public-key = /tmp/teleconf-xxx-key.pem
*INFO: [TELEPRESENCE] [CFG] ssl-ca = /tmp/teleconf-xxx-key.pem
*INFO: [TELEPRESENCE] [CFG] ssl-mutual-auth = no
*INFO: [TELEPRESENCE] [CFG] srtp-mode = optional
*INFO: [TELEPRESENCE] [CFG] srtp-type = sdes;dtls
*INFO: [TELEPRESENCE] [CFG] presentation-sharing-enabled = no
*INFO: [TELEPRESENCE] [CFG] presentation-sharing-process-local-port = 2083
*INFO: [TELEPRESENCE] [CFG] presentation-sharing-base-folder = ./presentations
*INFO: [TELEPRESENCE] [CFG] presentation-sharing-app = soffice
*INFO: [TELEPRESENCE] [CFG] Bridge with id ='10060' added
*INFO: [TELEPRESENCE] [CFG] Bridge with id ='10061' added
*INFO: [TELEPRESENCE] Presentation sharing not enabled
*INFO: Stack running in SERVER mode
*INFO: tsk_timer_manager_start
*INFO: Timer manager run()::enter
*INFO: TIMER MANAGER -- START
*INFO: SIP STACK::run -- START
*INFO: tnet_transport_prepare()
*INFO: pipeR fd=5
*INFO: Socket added[SIP transport]: fd=5, tail.count=1
*INFO: master fd=3
*INFO: Socket added[SIP transport]: fd=3, tail.count=2
*INFO: tnet_transport_prepare()
*INFO: Transport::run() - enter
*INFO: pipeR fd=7
*INFO: Socket added[SIP transport]: fd=7, tail.count=1
*INFO: master fd=4
*INFO: Socket added[SIP transport]: fd=4, tail.count=2
*INFO: Starting [SIP transport] server with IP {192.168.99.99} on port {20060} 
using fd {3} with type {2}...
*INFO: Transport::run() - enter
*INFO: SIP STACK -- START
*INFO: Starting [SIP transport] server with IP {192.168.99.99} on port {20060} 
using fd {4} with type {64}...
*INFO: ioctlt(4), len=0 returned zero or failed
*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- FD_ACCEPT(fd=9)
*INFO: Socket added[SIP transport]: fd=9, tail.count=3
*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- TNET_POLLOUT
*INFO: WebSocket Peer accepted/connected with fd = 9
*INFO: #1 peers in the 'SIP transport' transport
*INFO: WebSocket handshake message: GET / HTTP/1.1
Host: 192.168.0.21:20060
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 
Firefox/26.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Sec-WebSocket-Version: 13
Origin: http://xxx.yyy.zzz.com
Sec-WebSocket-Protocol: sip
Sec-WebSocket-Key: wmkDEUSqxVD7QuQSh+saXw==
Connection: keep-alive, Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket

*INFO: WebSocket Peer accepted/connected with fd = 9
*INFO: *** Stream Peer destroyed ***
*INFO: #0 peers in the 'SIP transport' transport
*INFO: #1 peers in the 'SIP transport' transport
***ERROR: function: "tsip_transport_layer_ws_cb()"
file: "src/transports/tsip_transport_layer.c"
line: "397"
MSG: WS handshaking not done yet
*INFO: Removing socket 9
*INFO: Socket to remove: fd=9, index=2, tail.count=3
*INFO: RemoveSocket(fd=9) has been requested but we are poll()ing the socket. 
ShutdownSocket(fd) called on the socket and we deferred the request.
*INFO: ShutdownSocket(fd=9)
*INFO: WebSocket Peer closed with fd = 9
*INFO: #0 peers in the 'SIP transport' transport
*INFO: PipeR event = 1
*INFO: *** Stream Peer destroyed ***
*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- TNET_POLLHUP(9)
*INFO: Removing socket 9
*INFO: Socket to remove: fd=9, index=2, tail.count=3
*INFO: CloseSocket(9)
*INFO: PipeR event = 1
*INFO: WebSocket Peer closed with fd = 9
*INFO: WebSocket Peer closed with fd = 9

What version of the product are you using? On what operating system?

Centos 6.5 64-bit, Mozilla Firefox 26.0

Please provide any additional information below.

How can i figure this out? 

Original issue reported on code.google.com by cklorin...@gmail.com on 19 Dec 2013 at 10:03

GoogleCodeExporter commented 8 years ago
Getting the same issue on Centos 6.5 32bit

*INFO: WebSocket Peer accepted/connected with fd = 15
*INFO: *** Stream Peer destroyed ***
*INFO: #0 peers in the 'SIP transport' transport
*INFO: #1 peers in the 'SIP transport' transport
***ERROR: function: "tsip_transport_layer_ws_cb()"
file: "src/transports/tsip_transport_layer.c"
line: "397"
MSG: WS handshaking not done yet
*INFO: Removing socket 15
*INFO: Socket to remove: fd=15, index=2, tail.count=3
*INFO: RemoveSocket(fd=15) has been requested but we are poll()ing the socket. 
ShutdownSocket(fd) called on the socket and we deferred the request.
*INFO: ShutdownSocket(fd=15)
*INFO: WebSocket Peer closed with fd = 15
*INFO: #0 peers in the 'SIP transport' transport
*INFO: *** Stream Peer destroyed ***
*INFO: PipeR event = 1
*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- TNET_POLLHUP(15)
*INFO: Removing socket 15
*INFO: Socket to remove: fd=15, index=2, tail.count=3
*INFO: CloseSocket(15)
*INFO: PipeR event = 1
*INFO: WebSocket Peer closed with fd = 15
*INFO: WebSocket Peer closed with fd = 15

Original comment by android....@gmail.com on 1 Jan 2014 at 4:36

GoogleCodeExporter commented 8 years ago
Same problem on Ubuntu 12.04 64bit, Firefox 26 and doubango version 2.6.0. With 
Chrome 31 all works fine.

Original comment by pburlov on 14 Jan 2014 at 1:01

Attachments:

GoogleCodeExporter commented 8 years ago
It happens sometimes also on Chrome :(

Original comment by patrik.t...@gmail.com on 25 Mar 2014 at 10:59

GoogleCodeExporter commented 8 years ago
After searching and trying I have found that it has something to do with ssl 
certicates

Original comment by patrik.t...@gmail.com on 29 Mar 2014 at 12:07

GoogleCodeExporter commented 8 years ago
So now what? 
Same problem here!!

Original comment by diogo...@gmail.com on 26 Jun 2014 at 3:03

GoogleCodeExporter commented 8 years ago
Nothing?

Original comment by di...@fluxoti.com on 7 Jul 2014 at 1:07

GoogleCodeExporter commented 8 years ago
Can you explain the reasoning behind why you believe it has something to do 
with SSL certificates?

Original comment by jtho...@silvervue.com on 9 Jul 2014 at 8:38

GoogleCodeExporter commented 8 years ago
So, I'm tracking it to something strange happening with the peer->ws structure 
for the currently connected peer.

I put a LOG print inside  tsip_transport_layer_ws_cb(const 
tnet_transport_event_t* e) to print 
the current status of the peer->ws.handshaking_done variable.

It seems like the socket connects, it detects the GET, everything checks out 
and handshaking_done is set, and then for the next packet that comes in, 
suddenly handshaking_done is '0' again. I print the address of the peer to 
prove it is the same structure both times.

Either it's some kind of race condition, or some condition happens in the 
middle that resets the structure:

This is the part of interest:
"*INFO: SETTING HANDSHAKING DONE on dc007830 0
*INFO: SET HANDSHAKING DONE on dc007830 1
*INFO: WebSocket Peer accepted/connected with fd = 18
*INFO: *** Stream Peer destroyed ***
*INFO: #0 peers in the 'SIP transport' transport
*INFO: #1 peers in the 'SIP transport' transport
*INFO: FIRST BYTES OF WEBSOCKET PACKET: � �  
*INFO: CHECKING HANDSHAKING DONE on dc007830 0
"

*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- FD_ACCEPT(fd=18)
*INFO: Socket added[SIP transport]: fd=18, tail.count=3
*INFO: NETWORK EVENT FOR SERVER [SIP transport] -- TNET_POLLOUT
*INFO: WebSocket Peer accepted/connected with fd = 18
*INFO: #1 peers in the 'SIP transport' transport
*INFO: FIRST BYTES OF WEBSOCKET PACKET: G E T  
*INFO: CHECKING HANDSHAKING DONE on dc007830 0
*INFO: WebSocket handshake message: GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: 204.236.130.239:20060
Origin: http://192.168.1.72:3000
Sec-WebSocket-Protocol: sip
Pragma: no-cache
Cache-Control: no-cache
Sec-WebSocket-Key: GZ08ej3U8xLrnlkTt368/Q==
Sec-WebSocket-Version: 13
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits, 
x-webkit-deflate-frame
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 
(KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36

*INFO: IS WEBSOCKET PROTOCOL
*INFO: FOUND SIP
*INFO: SENDING SWITCH PROTOCOL RESPONSE
*INFO: SENDING RESPONSE
*INFO: SETTING HANDSHAKING DONE on dc007830 0
*INFO: SET HANDSHAKING DONE on dc007830 1
*INFO: WebSocket Peer accepted/connected with fd = 18
*INFO: *** Stream Peer destroyed ***
*INFO: #0 peers in the 'SIP transport' transport
*INFO: #1 peers in the 'SIP transport' transport
*INFO: FIRST BYTES OF WEBSOCKET PACKET: � �  
*INFO: CHECKING HANDSHAKING DONE on dc007830 0
***ERROR: function: "tsip_transport_layer_ws_cb()" 
file: "src/transports/tsip_transport_layer.c" 
line: "410" 
MSG: WS handshaking not done yet

Original comment by jtho...@silvervue.com on 9 Jul 2014 at 9:44

GoogleCodeExporter commented 8 years ago
So, I noticed the non-blocking versus blocking state machine handles the 
'accepted/connected' condition differently.

Maybe this is an error where a fix in one didn't quite make it to the other?

I've modified tsip_transport_layer.c in doubango with the following change:
Line 367:
{{{
                case event_accepted:
                case event_connected:
                        {
                                tsip_transport_stream_peer_t* peer;
                                // find peer
                                if((peer = tsip_transport_find_stream_peer_by_local_fd(transport, e->local_fd))){
                                        // If peer already exists.. do nothing :0
                                        return  0;
                                }
                                else {
                                        TSK_DEBUG_INFO("WebSocket Peer accepted/connected with fd = %d", e->local_fd);
                                        return tsip_transport_add_stream_peer(transport, e->local_fd, transport->type, tsk_true);
                                }
                        }
}}}

And now I don't get the intermittent connection failures because the 
handshaking_done flag now stays consistent to the overall state of the 
websocket.

If you look at line 175, it seems to perform some similar functionality where 
it searches for the peer, and if it finds it it doesn't create a new stream... 
but instead strangely frees the memory behind the peer.

Original comment by jtho...@silvervue.com on 9 Jul 2014 at 10:37

GoogleCodeExporter commented 8 years ago
is there some reason the above changes have not been accepted upstream and is 
there any other workaround for this issue ?

Original comment by aucam...@gmail.com on 4 Aug 2015 at 3:51