emorgado / bluecove

Automatically exported from code.google.com/p/bluecove

Problem handling multiple received connections on MS Stack #74

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a Thread that listens to incoming SPP connections and delegates 
their handling to other Threads;
2. Create a Thread that periodically runs inquiries (start an inquiry,
process the results, sleep for some seconds, perhaps half a minute, then
restart the procedure);
3. Create a phone client that opens SPP connections to the server,
exchanges some data (server sends some bytes, then receives some, then
sends some more, and so on until it's done), closes the connection and
starts the process all over again after some arbitrary time;
4. Run the client on two or three phones concurrently trying to connect to 
the server.

What is the expected output? What do you see instead?
The server part is expected to handle the incoming connections or cleanly 
drop the ones it can't handle at the moment, and it does that for a while. 
However, after a while the communication with one of the running clients
fails during the data exchange, and after that the server never receives
another connection. The most recently seen devices keep showing up in the
inquiries for a while, then nothing is ever seen on the BT interface again.
Everything stays broken until the server is restarted or the dongle is
manually removed from the USB port and inserted again. Sometimes only
reinserting the dongle works.
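
For reference, the "handle or cleanly drop" behaviour expected above could be
sketched roughly as follows. This is a minimal sketch under assumptions, not
the actual server code: BoundedConnectionDispatcher and Handler are
hypothetical names, and java.io.Closeable merely stands in for the connection
returned by acceptAndOpen(), which would need a small adapter.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hands accepted connections to a bounded worker pool and drops the rest. */
class BoundedConnectionDispatcher {
    /** Hypothetical callback that performs the data exchange on one connection. */
    interface Handler {
        void handle(Closeable connection) throws IOException;
    }

    private final ThreadPoolExecutor pool;
    private final Handler handler;

    BoundedConnectionDispatcher(int maxConcurrent, Handler handler) {
        // A SynchronousQueue has no capacity, so a task is either taken by a
        // worker right away or rejected; nothing waits in a backlog.
        this.pool = new ThreadPoolExecutor(maxConcurrent, maxConcurrent,
                0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.AbortPolicy());
        this.handler = handler;
    }

    /** Returns true if a worker took the connection, false if it was dropped. */
    boolean dispatch(Closeable connection) {
        try {
            pool.execute(() -> {
                try {
                    handler.handle(connection);
                } catch (IOException e) {
                    // The exchange failed; the connection is closed below.
                } finally {
                    closeQuietly(connection);
                }
            });
            return true;
        } catch (RejectedExecutionException busy) {
            closeQuietly(connection); // every worker is busy: drop cleanly
            return false;
        }
    }

    /** Stops accepting work; running handlers are allowed to finish. */
    void shutdown() {
        pool.shutdown();
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close itself fails.
        }
    }
}
```

An accept loop would call dispatch(...) once per accepted connection; a false
return means that client was turned away immediately instead of being queued.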

What BlueCove version are you using (include build number for SNAPSHOT)? On 
what operating system and jvm? Is this 64-bit or 32-bit OS and jvm?
I'm using bluecove-2.1.0.jar on Windows XP SP 3 32-bit running Sun JDK 
1.6.0 update 13. I'm also using a standard USB BT 2.0 dongle running on MS 
Stack. I have not tried any SNAPSHOTs yet.

Please provide any additional information below.
Please use "Attach a file" when uploading stack traces or other big files!

I've run the application with debug on BlueCove enabled and while the 
server is working I get the following:

1) For the inquiry, I get the prints attached in correct_inquiry.txt.

2) For the accepted connections, I get the prints attached in 
correct_accepted.txt.

After a while, during one of these cycles where the server is available to 
accept connections (it doesn't work while it's inquiring), I get:

18:01:57.406 connection accepted
      intelbth.cpp:1009
18:01:57.406 socket[3116] getpeeraddress
      intelbth.cpp:1260
18:01:57.406 connection open, open now 1
      com.intel.bluetooth.RemoteDeviceHelper$RemoteDeviceWithExtendedInfo.addConnection(RemoteDeviceHelper.java:85)
18:01:57.406 socket[3116] getpeeraddress
      intelbth.cpp:1260
18:01:57.406 socket[3224] accept
      intelbth.cpp:997
18:01:57.421 socket[3116] send(int)
      intelbth.cpp:1128
18:01:57.421 socket[3116] recv()
      intelbth.cpp:1030

And the server connections hang from there on. The corresponding phone
reports that it sent its data and is waiting for the server's reply, but
the reply never arrives (nor does the server print anything else on the
matter). At this point, no phone is able to connect to the server anymore,
including the one that hung.
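
A hang like the recv() above is easier to notice if the handler bounds each
blocking read. What follows is only a minimal sketch, assuming plain java.io
streams: TimedReader is a hypothetical helper, and it only detects the stall.
Cancelling the task may not unblock a native read, so the caller would
probably still have to close the connection afterwards.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Bounds a blocking read so a stalled peer surfaces as an error. */
class TimedReader {
    // Daemon threads, so a read that never returns cannot keep the JVM alive.
    private final ExecutorService readers = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-reader");
        t.setDaemon(true);
        return t;
    });

    /** Reads one byte (0-255) or -1 at end of stream; throws on timeout. */
    int readByte(InputStream in, long timeoutMillis) throws IOException {
        Callable<Integer> task = () -> in.read();
        Future<Integer> pending = readers.submit(task);
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(true); // best effort; the read may stay blocked
            throw new IOException("read timed out after " + timeoutMillis + " ms");
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for data");
        }
    }
}
```

With something like this in place, the handler thread would get an
IOException after the timeout instead of waiting forever, and could log the
failure and close the connection.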

After that, when the inquiry starts on the other Thread it keeps returning
some of the phones (the prints are similar to what I described earlier,
except that usually not all phones appear), but turning Bluetooth off on
the phones has no effect on the results of the inquiries (none of them is
paired with the computer). After some more time the phones start
disappearing from the results and I keep getting the following for the
inquiries:

18:07:57.921 runDeviceInquiry, duration=12
      intelbth.cpp:238
18:07:57.921 deviceInquiryStartedCallback
      com.intel.bluetooth.DeviceInquiryThread.deviceInquiryStartedCallback(DeviceInquiryThread.java:124)
18:07:57.921 startInquiry return true
      com.intel.bluetooth.DeviceInquiryThread.startInquiry(DeviceInquiryThread.java:90)
18:08:10.734 WSALookupServiceBegin error [10108] No such service is known. The service cannot be found in the specified name space.
      intelbth.cpp:304
18:08:10.734 runDeviceInquiry ends
      com.intel.bluetooth.DeviceInquiryThread.run(DeviceInquiryThread.java:115)

At this point it doesn't matter what phones have BT on or off, or if they 
are trying to connect to the server or just visible, nothing works until 
the server is killed and started again (and sometimes even that doesn't 
seem to work and I need to manually remove and insert the dongle again).

Any thoughts as to why this happens and/or how to avoid such behavior?

Original issue reported on code.google.com by andre....@gmail.com on 8 Jun 2009 at 10:53

Attachments:

GoogleCodeExporter commented 9 years ago
Updating...

Today I ran the same test under version 2.1.1 SNAPSHOT.47 and it still
happened. The "No such service is known. (...)" message seemed to appear
faster than before, though it might be just coincidence.

I also tested it all under Linux to see what happened, and it all seemed to
work. Since this looks like a concurrency issue it's hard to be sure, but I
tried to break it for a while without success and it kept working.

I'm guessing the MS stack doesn't cope well with some concurrent situation,
such as inquiring while accepting connections, and melts down. Should that
be the case, I guess the remaining question is whether there's a way for the
application to detect that meltdown (so an automated restart of the
application can be provided, for example), or perhaps some way to detect it
and restart it all from under the covers. In my case, the worst part is that
the server keeps running completely unaware that it can no longer see or
hear anything.
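
One way an application might detect that state by itself is a small health
monitor fed from its own threads. This is a sketch under assumptions, not a
BlueCove feature: StackHealthMonitor and its thresholds are hypothetical, and
it assumes the application can tell a failed inquiry from a successful one,
for example from the discType passed to DiscoveryListener.inquiryCompleted.
Whether the WSALookupServiceBegin failure is actually reported that way is
not confirmed here.

```java
/** Tracks what the application observes and flags a stack that looks stalled. */
class StackHealthMonitor {
    private final int maxInquiryFailures;
    private final long maxSilenceMillis;
    private int consecutiveInquiryFailures;
    private long lastActivityMillis;

    StackHealthMonitor(int maxInquiryFailures, long maxSilenceMillis, long nowMillis) {
        this.maxInquiryFailures = maxInquiryFailures;
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastActivityMillis = nowMillis;
    }

    /** Records the end of an inquiry; ok is false when it ended with an error. */
    synchronized void inquiryFinished(boolean ok, int devicesFound, long nowMillis) {
        if (!ok) {
            consecutiveInquiryFailures++;
            return;
        }
        consecutiveInquiryFailures = 0;
        if (devicesFound > 0) {
            lastActivityMillis = nowMillis; // something is still visible
        }
    }

    /** Records that the acceptor thread received a connection. */
    synchronized void connectionAccepted(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    /** True after repeated inquiry failures or a long period without activity. */
    synchronized boolean looksStalled(long nowMillis) {
        boolean inquiriesFailing = consecutiveInquiryFailures >= maxInquiryFailures;
        boolean silentTooLong = nowMillis - lastActivityMillis > maxSilenceMillis;
        return inquiriesFailing || silentTooLong;
    }
}
```

The inquiry thread would call inquiryFinished(...) after each cycle, the
acceptor would call connectionAccepted(...), and a timer would poll
looksStalled(...) to trigger a restart. Since the inquiries keep returning
stale phones for a while after the hang, accepted connections are probably
the more trustworthy signal, and the silence threshold would need tuning for
periods when no phone is really around.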

I don't think I'd be able to fix this by myself, but I'd be happy to test
anything that someone with the proper skills comes up with to fix it.

Thanks to anyone who had the patience to read it all. Sorry if I got way too
verbose with it.

Original comment by andre....@gmail.com on 10 Jun 2009 at 2:16

GoogleCodeExporter commented 9 years ago
Let's try a short reply:
Can you try some other, more stable USB dongle on Windows, like the one from
MS (Microsoft Wireless Transceiver for Bluetooth v3.0)?
Does it make a difference?

Original comment by skarzhev...@gmail.com on 10 Jun 2009 at 4:30

GoogleCodeExporter commented 9 years ago
Hi skarzhevskyy,

Thanks for the reply.

Unfortunately I haven't found one of these 3.0 devices around here to buy
and try yet. But I did try more than one brand of 2.0 devices,
unsuccessfully. I'll try to get my hands on one of the devices on that list
on the site, and I believe I can manage to try it on some WIDCOMM device
too. Would that help at all?

But are the devices really that platform dependent? As I said, on BlueZ all
the same devices worked fine, and the problem only arises on Windows XP when
I put the server under stress; otherwise it works smoothly too.

Original comment by andre....@gmail.com on 10 Jun 2009 at 1:49

GoogleCodeExporter commented 9 years ago
I think that there is some problem in the MS stack...

I also used a D-Link DBT-120 under stress. It is not Bluetooth v2.0 + EDR,
but it works fine under stress on MS...

Original comment by skarzhev...@gmail.com on 10 Jun 2009 at 2:02

GoogleCodeExporter commented 9 years ago
I was actually hoping to use one of those extended range 2 km devices. It
was only when I started having these problems that I started testing on
different devices to get a bigger picture of what the issue was. I guess MS
doesn't want any more complex BT applications on Windows XP.

A couple of last questions:

1) Judging by the log after the stack starts failing, shouldn't
deviceInquiryStartedCallback be called with false instead of true, and only
after WSALookupServiceBegin returns?

2) Would it be possible to perform some action after WSALookupServiceBegin
fails with that "No such service is known." message, maybe something like
closing everything and restarting the BT service so that the stack returns
to a functional state? Since at this point nothing else is working anymore,
I don't see why dropping it all would hurt anyone.

Original comment by andre....@gmail.com on 10 Jun 2009 at 2:57

GoogleCodeExporter commented 9 years ago
I don't know how to restart the MS stack other than rebooting the computer.

And to the best of my knowledge, the function WSALookupServiceBegin returns
only after discovery is completed. You can try to debug it and see. Pay
attention to the times in the log.

Original comment by skarzhev...@gmail.com on 10 Jun 2009 at 3:22

GoogleCodeExporter commented 9 years ago
Actually, I meant restarting the BlueCove parts. I noticed there's a
shutdown hook, which I assume does some cleaning up before the VM exits, and
since most of the time just restarting the application seems to fix this
inconsistent state, I figured it might be because some of this cleanup (or
perhaps the initialization of the library again) actually restores the MS
stack to a useful state. Otherwise I guess a full computer restart would be
needed.

If that's the case, I think cleaning up and starting all over would be
preferable to the blind and deaf state the application gets stuck in
(completely unaware that it is in it).

Original comment by andre....@gmail.com on 10 Jun 2009 at 6:01

GoogleCodeExporter commented 9 years ago
I figured it could be a workaround for the situation, since MS is busy
trying to get rid of XP and shove Vista down our throats, and if they
bothered to fix this, the fix probably wouldn't be released for XP.

Original comment by andre....@gmail.com on 10 Jun 2009 at 6:05