jackbboy / csipsimple

Automatically exported from code.google.com/p/csipsimple

Registration reliability tracking issue #1136

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
There are two groups of problems:
1. From time to time CSS loses registration, then starts registering again by 
itself
2. CSS may fall into a state where it doesn't send REGISTER requests anymore. 
A manual push is required

First group of cases - the reason is perhaps irregularity in sending REGISTER 
requests. If the "expired" time is set to, let's say, 10 min, it may send the 
next request in 12 min or more (sometimes the delay can be an hour or even 
more). While it's late, CSS is obviously unregistered from the VoIP provider 
and can't get incoming calls. At the same time, it can send REGISTER requests 
earlier than configured. E.g. it may send the next request 1 or 2 min after the 
last one, making unnecessary registrations... See the example posted in 
Comment #86: http://code.google.com/p/csipsimple/issues/detail?id=81#c86
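
One way to quantify this irregularity from a SIP server log is to compare the gap between consecutive REGISTER requests with the configured expiry. The following is a minimal sketch, not CSipSimple code: the function name `classify_refreshes`, the `early_fraction` threshold, and the sample timestamps are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def classify_refreshes(timestamps, expires_s, early_fraction=0.5):
    """Label the gap between each pair of consecutive REGISTER requests.

    'late'  : the gap exceeds the expiry, so the binding lapsed for a while
    'early' : the gap is below early_fraction * expiry (assumed threshold)
    'ok'    : anything in between
    """
    results = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap_s = (cur - prev).total_seconds()
        if gap_s > expires_s:
            label = "late"
        elif gap_s < early_fraction * expires_s:
            label = "early"
        else:
            label = "ok"
        results.append((gap_s, label))
    return results


# Hypothetical REGISTER times, in minutes after the first request,
# checked against a 10 minute (600 s) expiry.
start = datetime(2011, 7, 9, 8, 0)
sample = [start + timedelta(minutes=m) for m in (0, 9, 21, 22, 31)]

for gap_s, label in classify_refreshes(sample, expires_s=600):
    print(f"{gap_s:6.0f} s  {label}")
```

For a 'late' gap, the time spent unregistered is roughly the gap minus the expiry.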

Second group of cases - I've noticed that CSS may suddenly just stop sending 
REGISTER requests at all. At that time other accounts (I keep several accounts 
active at a time) continue to make registrations as usual...

The message, when it's stuck, could be:
Error while registering -
Unauthorized

If I catch that (for example, in the morning I may discover that it has been 
stuck for the whole night), I can deactivate the account and then activate it 
again. It registers immediately (there is no "Unauthorized" problem, of 
course). That's a very dangerous scenario - I think it's registered with the 
VoIP provider while it's not, and I can't get any calls, of course :(

The worst scenario though is when CSS gets stuck showing that it's registered 
while it's not. See my comment #5: 
http://code.google.com/p/csipsimple/issues/detail?id=1066#c5
Fortunately it happens only on very rare occasions.

Note: in all cases the cell phone is completely idle (no user interaction with 
the phone). I watched the registration status by checking the VoIP provider's 
status page and the SIP server log.

I'm sorry, but it's very frustrating to see how unreliable the SIP registration 
process in CSS is :(

What steps will reproduce the problem?
1. Watch for registration status with VoIP provider
2. Push CSS to register if it's stuck with the "Unauthorized" error
3. Check whether CSS is really registered, or only shows that it is while 
it's not (check the VoIP provider's status to see that)

What is the expected output? What do you see instead?
CSS is registered with VoIP provider(s) all the time

What version of the product are you using? On what operating system?
Latest market build, 0.02-03 r944
Android 2.1 

Please provide any additional information below.

Original issue reported on code.google.com by yok...@gmail.com on 9 Jul 2011 at 8:56

GoogleCodeExporter commented 9 years ago
One more guess...

Both 3 & 5 cases can happen
when the network(3G) doesn't respond to CSS's registration request.
The no response situation can happen when the network is unstable or
registration request is sent before the network becomes stable 
(between Wifi Off and 3G On).

In this case, 
CSS waits forever without retry even though network becomes alive...
(when other internet applications work...)

I guess, from this phenomenon, that CSS does not have a retry function 
for when the response time becomes too long.
This kind of network error should be handled in real network situations... 
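
A retry of the kind described here could look roughly like the sketch below. It is only an illustration under stated assumptions: `send_register` is a hypothetical callable that sends one REGISTER and raises `TimeoutError` if no response arrives within `timeout_s`, and the delay values are arbitrary. It does not describe what CSS actually does.

```python
import time


def register_with_retry(send_register, timeout_s=10.0, base_delay_s=5.0,
                        max_delay_s=300.0, max_attempts=6, sleep=time.sleep):
    """Try to register, retrying with exponential backoff on timeout.

    Returns the attempt number that succeeded, or None if every attempt
    timed out or was rejected.
    """
    delay_s = base_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            # send_register returns True once the server accepts the request.
            if send_register(timeout_s):
                return attempt
        except TimeoutError:
            pass  # no response in time: treat like a failed attempt
        if attempt < max_attempts:
            sleep(delay_s)
            # Double the wait each time, but never exceed max_delay_s.
            delay_s = min(delay_s * 2, max_delay_s)
    return None
```

The `sleep` parameter is injectable so that the backoff can be exercised without actually waiting.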

I hope this simple idea can help..

Thanks..

Original comment by hansg...@gmail.com on 15 Mar 2012 at 11:17

GoogleCodeExporter commented 9 years ago
Exactly what happens to me hansg.

Original comment by kro...@gmail.com on 15 Mar 2012 at 8:29

GoogleCodeExporter commented 9 years ago
reporting one more phenomenon.

When CSS is connected through Wifi and I'm watching the account screen.
Then, an account becomes Inactivated immediately if I close Wifi connection.
This is normal operation...

When CSS is connected through 3G and I'm watching the account list screen.
Then, if I close 3G connection, the account's status is NOT changed
even though the connection icon disappears in the display bar at the top of the 
screen. 

This is a strange behavior ONE.

At this time, if I touch the account icon, it becomes inactivated as expected.
Then, if I touch the account icon again to make it register, 
the account's status changes to connected immediately even though 
the 3G connection is closed now... 

This is strange behavior TWO.

from strange behavior ONE and TWO, 
I guess ... 
1. There's a hole in 3G connection detection procedure.
2. because of 1, some state machines in CSS are not synchronized in specific 
situation...
3. And this is the reason why CSS looks like it does not retry registration
when request time is over. Actually, at the moment, CSS THINKS that 3G is 
connected, so it just waits just like 2 above.

Later, I found that in both the 3 & 5 error cases, if I toggle the account 
activation, 
strange behavior TWO happens..
So there may be a connection between the 3 & 5 error cases and the strange 
behaviors...

Thanks..

Original comment by hansg...@gmail.com on 16 Mar 2012 at 4:11

GoogleCodeExporter commented 9 years ago
Thanks a lot for the feedback. 
Sorry if I don't reply immediately but it helps me a lot to reproduce the issue 
on my side :).
It also gives me good clues about what could be the problem. As you guessed, I 
think it's something to do with the detection of 3G. I recently switched my 
main test device to 4.0 and I think that the behavior of 3G detection is 
different than it was in 2.x (btw, even the stock sip app changed). I naively 
followed these changes, but it probably breaks things for older devices (the 
stock sip app doesn't have this problem since they always only target one 
android version ;) ).

Original comment by r3gis...@gmail.com on 16 Mar 2012 at 8:29

GoogleCodeExporter commented 9 years ago
One more thing...

The malfunction of 3 & 5 might be because of CSS's system function call crash...
The reason why I guess like this is ... 

when it is in off line and no retry... 
strange phenomena happen.

1. I can't stop the CSS application from the task manager... 
It says something is still running .. I have to terminate it forcefully.. 

2. If I plug in an earphone jack, the mp3 sound should come out through the 
earphone... 
But in this crash situation it doesn't work; the sound comes out through the 
speaker.

From these phenomena, I guess some system function called by CSS has not 
returned or has crashed... because ??? I don't know... 
Anyway, it is related to the SOUND function .. I guess.. 

Thanks.

Original comment by hansg...@gmail.com on 23 Mar 2012 at 6:43

GoogleCodeExporter commented 9 years ago
The problem persists.

I have CSS configured for wifi only.  I found that putting wifi down then up 
caused the client to re-register successfully.

I have installed nightly 1352, and enabled logging.  I'll report back any 
status.

Regards,
Brian

Original comment by bpwinfrey@gmail.com on 27 Mar 2012 at 1:19

GoogleCodeExporter commented 9 years ago
Log from build 1352.  
https://docs.google.com/open?id=0B9F0Ko2xTdaAanVUUDNpaFVSOWF5ZjJfUWVPT1EtQQ

updated nightly build.

Original comment by bpwinfrey@gmail.com on 27 Mar 2012 at 2:25

GoogleCodeExporter commented 9 years ago
Issue 1647 has been merged into this issue.

Original comment by r3gis...@gmail.com on 7 Apr 2012 at 2:17

GoogleCodeExporter commented 9 years ago
version: nightly build 1410

setting:  wifi sleep policy -> never

observed fact: the screen turns off, and wifi and CSS stay connected for a 
while.  It could be many hours before the connection is cut off, while other 
times it's a matter of an hour or so.  When the phone is woken, wifi 
re-establishes its connection, then CSS does as well.  Sometimes during a long 
chat the wifi connection icon disappears and a pause ensues until wifi 
re-establishes the connection.  My question:  if the wifi signal drops while 
the phone is in sleep state, would wifi be able to re-establish the connection 
like it does when I am using the phone?  At this time I am not even sure if 
the cutoff during sleep is a wifi drop problem or a CSS-induced problem.  

Original comment by chin.bil...@gmail.com on 22 Apr 2012 at 2:38

GoogleCodeExporter commented 9 years ago
Is it a device issue?

I have two android phones. One is an LG Optimus V, which is working beautifully 
with csipsimple. The other one is an HTC Droid Incredible. Installation is 
fine, but registration with PBXes is lost from time to time without any clue. 
I happened to notice that two other phones reported with similar problems are 
also HTC. So, could it be a device-specific issue?

Original comment by jianh...@gmail.com on 28 Apr 2012 at 12:37

GoogleCodeExporter commented 9 years ago
I think the problem is really the disconnection of wifi, not the fault of 
csipsimple.  For some reason my custom ROM doesn't seem able to latch onto the 
wifi signal.  I have tested it and concluded it's not csipsimple's fault.

Original comment by chin.bil...@gmail.com on 1 May 2012 at 11:09

GoogleCodeExporter commented 9 years ago
I have noticed CSS losing registration as well, usually after a long time 
asleep. 

I'm using the nightlies, and I use TLS connection to an Asterisk server, from 
behind a NAT router. I thought it was the phone sleeping, so I tried an 
experiment: I had Tasker wake the phone with dim screen for 3 seconds every 10 
minutes. The registration stayed on all day yesterday (when I was occasionally 
using the phone for checking email, etc), but overnight, with the tasker task 
running, it lost registration. When I woke the phone in the morning it had no 
CSS notification icon, and when I ran CSS it re-registered.

Wifi sleep = never, CSS Lock Wifi = on, phone was on charger all night.

This makes me believe that it is NOT the phone's sleep state that causes this 
loss of registration. 

Original comment by dlake...@gmail.com on 20 Jul 2012 at 3:12

GoogleCodeExporter commented 9 years ago
Also note, my Galaxy Tab 2 running the same nightlies and with CSS Wifi Lock = 
Yes, "Keep Wifi On During Sleep" = Never actually stays registered all night. 
It's running Android 4.0.4, kernel 3.0.8 with a bunch of RAM (~ 700 MB total, 
270 MB avail) vs the Exhibit II running Android 2.3.5 kernel 2.6.35.7 (~ 400MB 
total, ~ 90MB avail).

So it does seem like there's something wrong with the kernel or hardware 
drivers or resources available on the phone, whereas the tablet is behaving 
better.

Is there a way to have Tasker check CSS registration status so I could have it 
re-start the CSS user interface and hope to get re-registration as a work 
around until something gets fixed?

Original comment by dlake...@gmail.com on 20 Jul 2012 at 3:36

GoogleCodeExporter commented 9 years ago
This problem still exists on the Nexus One.  I believe it's related to stun, so 
turning it off may help (if you can - I can't).

Original comment by kro...@gmail.com on 20 Jul 2012 at 9:01

GoogleCodeExporter commented 9 years ago
I do not use STUN so I don't believe that it is related to this problem.

Original comment by dlake...@gmail.com on 27 Jul 2012 at 1:18

GoogleCodeExporter commented 9 years ago
Well I've done several things, but I think the one that seems to help the most 
is to use wifi "high perfs lock". I also set the registration expiration down 
to 250 seconds and keep alives down around 90 seconds. This seems to use a bit 
more battery but it makes my Samsung Exhibit II hold the registration for many 
hours now. I suspect that without the wifi high perfs lock the qualification 
"pings" from Asterisk are not received reliably or in sufficient time, and 
Asterisk drops the registration.

I am going to increase registration time to 450 seconds and try to bump TCP 
keepalives up as well maintaining high perf lock and see what happens.

Original comment by dlake...@gmail.com on 8 Aug 2012 at 3:49

GoogleCodeExporter commented 9 years ago
For the record:
Analysis of issue 1697 reveals that some servers (in this case the pbxes.org 
one) behave oddly when the SIP client retransmits a request. 
By default the application retransmits packets after 500 ms if there is no 
reply. If the phone is asleep and network latency is high, you can hit this 
case: the registration cannot complete, and when you come back to csipsimple 
the account is marked as registration failure (red). 
A workaround is to raise the retransmission value (for example to 1000 ms). 
This will now be the default for the pbxes.org wizard. 
Let me know if you observe the same thing on other SIP servers/providers.
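
To see why a larger value helps, here is a small sketch of the retransmission schedule that RFC 3261 describes for a non-INVITE request such as REGISTER over an unreliable transport: the first retransmission fires after T1, the interval doubles up to T2 (4 s), and the transaction gives up after 64*T1. Whether the SIP stack inside the application follows this schedule exactly is an assumption, and the 800 ms round-trip time is a made-up figure.

```python
def register_retransmit_schedule(t1_ms, t2_ms=4000):
    """Return (retransmit times in ms, transaction timeout in ms).

    Follows the RFC 3261 non-INVITE client transaction rules:
    intervals T1, 2*T1, 4*T1, ... capped at T2, timeout at 64*T1.
    """
    timeout_ms = 64 * t1_ms
    schedule = []
    interval_ms = t1_ms
    elapsed_ms = t1_ms
    while elapsed_ms < timeout_ms:
        schedule.append(elapsed_ms)
        interval_ms = min(2 * interval_ms, t2_ms)
        elapsed_ms += interval_ms
    return schedule, timeout_ms


def retransmits_before_reply(rtt_ms, t1_ms):
    """Count retransmissions sent before a reply with the given RTT arrives."""
    schedule, _ = register_retransmit_schedule(t1_ms)
    return sum(1 for t in schedule if t < rtt_ms)


# Hypothetical high-latency link: the reply takes 800 ms to come back.
for t1 in (500, 1000):
    print(t1, retransmits_before_reply(800, t1))
```

With the hypothetical 800 ms round trip, T1 = 500 ms produces one retransmission before the reply arrives, while T1 = 1000 ms produces none, so a server that mishandles retransmitted REGISTERs never sees one.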

Original comment by r3gis...@gmail.com on 17 Aug 2012 at 7:59

GoogleCodeExporter commented 9 years ago
Issue 1427 has been merged into this issue.

Original comment by r3gis...@gmail.com on 17 Aug 2012 at 8:08

GoogleCodeExporter commented 9 years ago
csipsimple@googlecode.com writes:

Hello, where is this setting?  I could not find it in preferences.

Original comment by bpwinfrey@gmail.com on 18 Aug 2012 at 3:29

GoogleCodeExporter commented 9 years ago
@bpwindfrey : 
The setting is in nightly build version (that you can download and install from 
http://nightlies.csipsimple.com/trunk/)
Once updated, follow instructions at the end of comment 5 here : 
http://code.google.com/p/csipsimple/issues/detail?id=1697#c5

Original comment by r3gis...@gmail.com on 19 Aug 2012 at 7:40

GoogleCodeExporter commented 9 years ago
This fixed it for me! you rock r3gis

Original comment by jhartle...@gmail.com on 13 Sep 2012 at 6:47

GoogleCodeExporter commented 9 years ago
Issue 1215 has been merged into this issue.

Original comment by r3gis...@gmail.com on 18 Sep 2012 at 9:33

GoogleCodeExporter commented 9 years ago
@r3gis:
It seems that another server, EuroTELEFON should have T1 set for 1000ms as 
well. I have tested it and it also solved the problem in case of connection 
with sip.eurotelefon.eu. 

Original comment by Tadeusz...@gmail.com on 4 Dec 2012 at 8:56

GoogleCodeExporter commented 9 years ago
I'm on the Galaxy Nexus w/4.2.2 (GSM) and I'm seeing this problem with 
Vitelity.net and not the Asterisk server at work.  It happens only in wifi, 
which is the only time I have SIP turned on.  I usually see it once a day, 
sometimes every other day.

Original comment by kevin.la...@gmail.com on 12 Mar 2013 at 2:01

GoogleCodeExporter commented 9 years ago
I'm on Nexus 4 (4.2.2).  Had a rock steady connection (single SIP account) with 
the latest release from Google Play.  But the nightly build downloaded on the 
23rd April 2013, seems to be experiencing a similar issue.

It seems to lose registration after a certain amount of time.  Clicking on the 
Account List icon shows in red (usually) that the account was Unregistered or 
encountered a registration error.  Manually deactivating and reactivating 
results in a clean connection & registration.  

@r3gis - is there anything I can do or any further details I can provide to 
help debug?  

Original comment by pre...@gmail.com on 28 Apr 2013 at 6:39

GoogleCodeExporter commented 9 years ago
A few more details to add:
   - whilst using CSipSimple from another internet connection (WiFi over ADSL) CSipSimple registers with the SIP provider immediately on connection with the internet.  
   - but after a 5 or so minute call, it fell into the 'unregistered' state. I reregistered it manually with no problems.  
   - it seems to be holding registration a bit more reliably (more like the stable version on the Play store did with my other internet WiFi via Wimax router)

Original comment by pre...@gmail.com on 30 Apr 2013 at 6:21

GoogleCodeExporter commented 9 years ago
I'm not sure if it's related to this issue, but I get "service unavailable" 
error messages for all sip accounts if I switch my wifi off, allow my mobile 
data connection to take over, wait for SIP registration and then reactivate my 
wifi connection. The same thing happens if I leave my wifi area and come back. 
I tried with both Callcentric and Iptel, and upon reactivating wifi, both 
accounts appear to time out after about 30 seconds of "registering." I'm 
running the latest stable version from the Play Store, which is reported as 
1.02.00R2330 on a Huawei MyTouch Q. In case it helps, my wifi router is a 
D-Link DIR-601 running OpenWRT 12.09, which usually gives me a very stable and 
reliable connection for most other applications. The only work-around I found 
is to disconnect CSS, deactivate wifi, reactivate wifi, wait 30 seconds to a 
minute for the connection to be fully established and all applications to see 
it and then rerun CSS, which appears to reset something to the point where it 
can detect and use the wifi connection. I tried going into settings and backing 
out without disconnecting or cycling the wifi radio, but the accounts still 
hang on "registering," this time for only about 10 to 15 seconds before timing 
out, as if there is no connection to the internet. However, other applications 
such as e-mail apps are getting a connection. Best I can figure is that there 
is something missing in the wifi detection in CSS. Either that, or the MyTouch 
Q is doing something different from other devices when the wifi is connected.

Original comment by kyle4je...@gmail.com on 18 Nov 2013 at 5:17

GoogleCodeExporter commented 9 years ago
One further detail about my specific issue ...
It seems I don't need to disconnect and rerun CSS to register my accounts once 
wifi is connected. The easiest work-around on my device is to activate my 
mobile data connection prior to leaving my wifi area and to deactivate it just 
before reentering the wifi area. Failing to deactivate mobile data upon 
entering a wifi zone causes the registration errors/timeouts  I describe above 
for all accounts. If I then deactivate both mobile data and wifi and then 
reactivate wifi, my accounts register almost immediately without any further 
trouble. Hopefully this info will be helpful in tracking down this issue. 
Thanks for the great work.

Original comment by kyle4je...@gmail.com on 19 Nov 2013 at 1:36

GoogleCodeExporter commented 9 years ago
I run into this issue in my weird edge case, where vpn failover takes long 
enough that registration gives up because it cannot resolve the address.
I can easily fix this via a small script, but I need a little help to do it.
There should be a way I can kickstart a registration retry from the command 
line using am. For example, I am able to use the browser to download a file 
from the web like this:  am start -a android.intent.action.VIEW -n 
com.android.browser/.BrowserActivity -d http://URL

Could anybody tell me how I could force a registration attempt in CSS from the 
command line like this?

Original comment by grndcntr...@gmail.com on 23 Nov 2013 at 6:42

GoogleCodeExporter commented 9 years ago
So this gives me major grief on my Nexus 4 and Callcentric.  CSipSimple 
registers without a problem, but after some time (several minutes to hours) 
while on wifi the registration drops.  CSipSimple shows the registration as 
active, but no calls can be received, and the Callcentric website shows the 
extension as disconnected.  This happens every time, but the amount of time 
until it happens is variable.  My best guess is that it has to do with moving 
between different access points on the same network, but it's just a guess (I 
rearranged my network so that my home network only had one access point, and 
the issue didn't arise - but I've only tried this on one occasion so far).  
Would love ideas for a workaround or a way to troubleshoot this.

Original comment by ZacharyG...@gmail.com on 7 Feb 2014 at 11:08

GoogleCodeExporter commented 9 years ago
I too am having this problem and would love a solution. HTC1V on Voipfone
Steve

Original comment by stevegarner26 on 14 Feb 2014 at 8:58

GoogleCodeExporter commented 9 years ago
I see this issue dates back to 2011. I'll be brief as I don't have any new 
details to add that weren't covered by prior commenters. I'm using CSS with 
Callcentric, and the only unusual thing is that for the first 6 to 12 months I 
used CSS I didn't see a problem, but it was sometime after about 6 months ago 
when Callcentric added support for extensions and I set up a dedicated 
extension for CSS that I started noticing Callcentric's dashboard indicating 
CSS being unregistered. This could be coincidental, or an artifact that prior 
to Callcentric supporting multiple extensions, if you used multiple SIP 
clients, whichever one registered last, won, and you couldn't easily tell if 
one client had stopped registering.

What can I do to help debug this? I've enabled logging (it's in the help menu, 
for those asking in prior comments).

Original comment by tme...@gmail.com on 21 Feb 2014 at 12:22

GoogleCodeExporter commented 9 years ago
With regard to Callcentric and CSS, I don't seem to have this problem (losing 
registration while CSS still shows registration is active) with other SIP apps 
on my Nexus 4.  Moreover, I don't have this problem when I access the same 
Callcentric extension through pbxes.org and access pbxes.org with CSS.  It 
seems to be an unusual interaction unique to the CSS-Callcentric combination.

Original comment by ZacharyG...@gmail.com on 21 Feb 2014 at 2:31

GoogleCodeExporter commented 9 years ago
I also suffer from this issue. I am certain it has nothing to do with WiFi 
availability or network problems. The SIP server in my cases resides on the 
local network and I have all the required logs (on the server).
This is basically what happens.

CSS stops registering for 1 hour but during this 1 hour it keeps sending the 
SUBSCRIBE event to the server for the account (every 4 minutes or so). So CSS 
is alive, the WiFi connection too but somehow CSS is lost and does not send a 
REGISTER. The server enforced the expiration to 60 seconds so it should retry 
after every minute but it does not.

After 1 hour CSS changes the local port and starts registering again. So there 
must be something wrong in CSS. I would like to help, but I understand it is 
almost impossible to reproduce this issue. I have 3 tablets in chargers running 
CSS all the time with the WiFi "never sleep" policy. All of them should ring 
for incoming calls, but sometimes some of them do not because they are not 
REGISTERed. 

How can I help to solve this issue?

Original comment by jakubklo...@gmail.com on 10 Mar 2014 at 9:29

GoogleCodeExporter commented 9 years ago
I had pbxes.org disconnecting on wifi occasionally, after reading through the 
whole thread comment #121 & link is what fixed it for me. I increased T1 delay 
to 2000 ms. But even though it's now more reliable, I still had a disconnect 
when my wifi internet was having trouble yesterday.

I think the better solution would be for CSipSimple to retry registration at a 
larger interval (every 5 or 10 minutes, to save battery) in case the internet 
route is down intermittently. Currently it goes into a dead-end state. There's 
no way to know unless the user turns on the device & checks the status bar. 
Even then it's not too obvious if there are a lot of icons up there. A red 
icon or an audio alert would also help when troubleshooting 
disconnect/re-registration issues. A red icon should appear when it's set up 
to connect to a SIP server but can't. A missing icon should mean it's not 
configured to connect (no SIP account, not set up for wifi, 3G, etc.) or the 
app is not running.
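
The icon rule proposed above can be written as a small decision function. This is a sketch only: the names `Icon` and `status_icon` and the parameters are hypothetical and not taken from CSipSimple, and the separate "registered" state is an assumption added so that the function covers every case.

```python
from enum import Enum


class Icon(Enum):
    NONE = "no icon"        # app not running or not configured to connect
    REGISTERED = "normal"   # assumed state: registration is active
    RED = "red"             # configured to connect, but registration failed


def status_icon(app_running, has_account, network_allowed, registered):
    """Pick the status bar icon according to the rule proposed above."""
    # Not configured to connect at all: show nothing.
    if not (app_running and has_account and network_allowed):
        return Icon.NONE
    # Configured to connect: red whenever registration is not active.
    return Icon.REGISTERED if registered else Icon.RED
```

For example, `status_icon(True, True, True, False)` yields `Icon.RED`, which distinguishes a failed registration from an account that is simply not set up for the current network.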

Original comment by jayz...@gmail.com on 12 Mar 2014 at 7:41

GoogleCodeExporter commented 9 years ago
Increasing T1 to 1000 has solved it for me.

Many Thanks!

Original comment by stevegarner26 on 20 Mar 2014 at 10:41

GoogleCodeExporter commented 9 years ago
Thanks to the very detailed feedback and logs from Jakub, we were able to find 
a problem that might affect some devices and results in re-registrations not 
being made.
From first tests the fix appears to solve the issue, but it would be useful if 
more people could test and give feedback.
To test, just upgrade to the latest nightly build version: 
http://nightlies.csipsimple.com/trunk/

Original comment by r3gis...@gmail.com on 30 Mar 2014 at 6:12

GoogleCodeExporter commented 9 years ago
After some extensive testing with multiple routers, SIP accounts and all kind 
of different settings, I found a surprisingly simple root cause of the problem: 
CSS loses registration as soon as its main app activity is destroyed. After 
that no more registration attempts are made for some reason. So as a 
workaround, I try to make sure now that the CSS main app always keeps running 
in the background and is listed among the recent apps. Of course this is still 
not 100% reliable as Android may close apps by itself when memory is low.

Original comment by dominik....@gmail.com on 13 May 2014 at 1:04

GoogleCodeExporter commented 9 years ago
Testing shortly with Kyocera C5155 ("Rise"), Android 4.0.3, provider 
CallCentric.

However, I did want to offer one other option: It would be nice if there were a 
forcible re-register interval option. Basically, if no other workaround works, 
a setting which will (as long as the app is not in an active call) simply 
disconnect and re-register after a specific amount of idle time since last 
registration.
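
The requested option boils down to a simple check that a periodic timer could run. A minimal sketch, assuming hypothetical names (`should_force_reregister` and its parameters) that do not correspond to any existing CSipSimple setting:

```python
def should_force_reregister(now_s, last_registration_s, interval_s, in_call):
    """Decide whether to drop and redo the registration.

    now_s and last_registration_s are timestamps in seconds;
    interval_s is the user-configured forced re-register interval.
    """
    # Never tear down the registration while a call is active.
    if in_call:
        return False
    return (now_s - last_registration_s) >= interval_s
```

For instance, with a 30 minute interval (`interval_s=1800`), a registration last refreshed 31 minutes ago would be redone unless a call is in progress.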

Original comment by tv@duh.org on 16 Jun 2014 at 1:36

GoogleCodeExporter commented 9 years ago
So far, so good; as of r2416, the registration held for more than a half hour, 
which is far longer than it was holding registration previously. Fingers 
crossed, this may fix it. Will report back perhaps in a couple days once I know 
for certain.

(I might still like to have a forcible re-registration interval as a backup, 
particularly when using with flaky public Wi-Fi.)

Original comment by tv@duh.org on 16 Jun 2014 at 2:38

GoogleCodeExporter commented 9 years ago
Yep, r2416 held its registration to Callcentric overnight (had a 
cookie-registered wget wrapper running all night to check). This looks fixed 
for my device/OS/provider combo.

Original comment by tv@duh.org on 16 Jun 2014 at 12:27

GoogleCodeExporter commented 9 years ago

Original comment by r3gis...@gmail.com on 22 Jun 2015 at 11:30