robotpy / pynetworktables

Pure python implementation of the FRC NetworkTables protocol

Client not auto-connecting to server in an... unusual environment #24

Closed Bobobalink closed 8 years ago

Bobobalink commented 8 years ago

I'm trying to use pynetworktables in a very strange threading environment (eventlet). Everything works fine, except that it doesn't connect to a NetworkTables server if the server is started after the client program. It connects properly and works fine if the server process is started first. I know this isn't a normal use-case, so I'm not sure whether or not to call it a bug. Is there a command I can call to force the library to check for a connection?

virtuald commented 8 years ago

There isn't really a way to tell the library to do that, as it's supposed to happen automatically. It happens on a separate thread somewhere...

I don't really know anything about eventlet, but perhaps you should use a debugger and step through the code and figure out why it's not connecting properly? Perhaps there's a weird network error being thrown somewhere unexpected?

Bobobalink commented 8 years ago

Unfortunately, running under the debugger disables the threading-related magic that causes the failure to auto-reconnect, so the problem doesn't reproduce there. Is there a single method that the separate thread is supposed to run to reconnect to the server? If so, I can handle running that method regularly myself. Nothing is logged after the disconnect, on either end.

virtuald commented 8 years ago

There's a reconnect loop somewhere... I think it just catches all exceptions? Eventlet might be throwing an unexpected exception somewhere and that's why it's failing?

Bobobalink commented 8 years ago

Where could I find said reconnect loop? Are there any logs for the errors it catches? I'd be happy to log them myself if I knew where to look...

virtuald commented 8 years ago

You can start here.

virtuald commented 8 years ago

I'm going to close this, as it's not really a supportable configuration... but, good luck figuring it out, and let me know if I can answer any more questions.

Bobobalink commented 8 years ago

What appears to be happening, after thorough debugging, is that instead of throwing an IOError when the connection is lost, it simply hangs forever. This leads to the awkward problem where the connectionLock is held forever, and I can't find a way around that short of nuking all of the NetworkTable infrastructure and starting over.

virtuald commented 8 years ago

That sounds like an eventlet bug then?

Bobobalink commented 8 years ago

I'm not sure, honestly. Somehow it's getting stuck in an inconsistent state. When I lose connection, the logs show that the state changes to DISCONNECTED_FROM_SERVER, and ConnectionListeners fire. However, the ClientConnectionLock never releases, which leaves everything in the awkward state of never being able to do anything, including reconnect automatically or manually. If I were to break out of ClientConnectionLock and reconnect anyway, what do I risk destroying or leaving in an inconsistent state? What objects should I handle to try to rebuild a NetworkTables connection after killing whatever is hanging on acquiring the ClientConnectionLock?

virtuald commented 8 years ago

Perhaps you could try to create a separate simple program that connects via a socket with eventlet, and see what happens when a connection is lost. Does it throw IOError? Does it hang? If you can reproduce the conditions with a simple example, it may be easier to deduce what error is occurring here.
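
A minimal sketch of such a test program might look like the following. It is not taken from pynetworktables; the helper name `probe_disconnect` and its return strings are made up for illustration. It starts a throwaway local server that accepts one connection and immediately closes it, then reports whether the client's `recv` sees an end-of-stream, raises an error, or blocks until the timeout. To run it under eventlet, the assumption is that uncommenting the `eventlet.monkey_patch()` lines at the top is enough to swap in the green socket and threading modules. Note that in Python 3, `IOError` is an alias of `OSError`.

```python
# Assumption: to reproduce under eventlet, uncomment these two lines
# so that socket and threading are replaced by their green versions.
# import eventlet
# eventlet.monkey_patch()

import socket
import threading


def probe_disconnect(timeout=5.0):
    """Report how a client recv() behaves when the server drops the connection.

    Returns "eof" if recv() returns b"", "data" if bytes arrive,
    "hang" if nothing happens within `timeout` seconds, or
    "error: ..." if an OSError (IOError) is raised.
    """
    # Throwaway server on an OS-assigned loopback port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def accept_then_close():
        # Accept one client and drop it immediately to simulate a lost server.
        conn, _ = server.accept()
        conn.close()

    worker = threading.Thread(target=accept_then_close, daemon=True)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    try:
        data = client.recv(1024)
        return "eof" if data == b"" else "data"
    except socket.timeout:
        # Without a timeout this is where the read would block forever.
        return "hang"
    except OSError as exc:
        return "error: %r" % (exc,)
    finally:
        client.close()
        worker.join(timeout)
        server.close()


if __name__ == "__main__":
    print(probe_disconnect())
```

With plain standard-library sockets a clean close normally shows up as "eof" rather than an exception, which is itself worth knowing: code that only expects an IOError on disconnect has to treat an empty read as a lost connection too. If the eventlet run reports "hang" where the plain run does not, that would point at the socket layer rather than at the reconnect logic.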