indigo-astronomy / indigo

INDIGO is a system of standards and frameworks for multiplatform and distributed astronomy software development designed to scale with your needs.
http://www.indigo-astronomy.org

Support of multiple network interfaces #319

Closed kkretzschmar closed 4 years ago

kkretzschmar commented 4 years ago

Hi Peter, we discussed this already. When I change the network interface of my laptop, e.g. from LAN to WLAN by unplugging the LAN cable, the client-server connection is kept alive until the server recognizes after a while (typically 15 min) that there is no traffic on this connection anymore and closes it.

The problem is that during this time it is not possible to reconnect to the server or, even worse, to stop the indigo.service (I don't know why). I have to kill the indigo process manually ...

The problem can also be reproduced with the Indigo Control Panel.

The only way out of this situation is to actively disconnect from the server when the network interface goes down, e.g. via the zeroconf service browser, which knows when an interface has been closed. But this also means that a client instance must be able to hold more than one connection to an INDIGO server: one connection thread for each network interface (each of which has its own IP address). This is currently not possible; when I try to connect with a new client instance I get a DUPLICATE error.

What do you think?

Thanks, Klaus

polakovic commented 4 years ago

Rumen, any idea how to fix it?

rumengb commented 4 years ago

I understand the issue but so far I do not think there is much that can be done about it...

First, the connection is routed through one interface and one IP address, so a connection from another interface is considered a totally new network connection. On the other hand, INDIGO does not have any timeouts once the connection is established. Communication can be sporadic and there may not be chatter for minutes, and this is OK.

So there is no way to know when this is done... I tried the same with ssh... it is the same if there is no timeout set. Maybe we can do something with mDNS to make it expire sooner, but still there is no way to find a radical solution.

The only way is to disconnect and reconnect again by hand... If you have a better idea, let me know.

kkretzschmar commented 4 years ago

I agree that on the server side there is nothing you can do, and I also would not recommend changing the timeout threshold, since as far as I know this would affect all TCP connections. It is a global setting.
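As a side note on the timeout point: the system-wide keepalive values are only defaults, and a program can turn on and tune TCP keepalive for a single socket without touching other connections. The sketch below shows this in Python. The function name `enable_keepalive` and the chosen values are illustrative assumptions, and nothing in this thread says that INDIGO sets these options.

```python
import socket

def enable_keepalive(sock, idle_s=30, interval_s=10, probes=3):
    """Turn on TCP keepalive for this one socket only (hypothetical helper)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform specific, so set only those that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):    # idle seconds before the first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):   # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):     # unanswered probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock

sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With settings like these, an idle connection whose peer has vanished would be reported as broken after roughly `idle_s + interval_s * probes` seconds on platforms that support all three options.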

I think the situation can be improved on the client side alone. If we create as many client instances as we have network interfaces, then switching from LAN to WLAN would mean creating a new connection on the WLAN interface while the LAN connection is blocked. The stalled LAN connection will then either be closed by the server after 15 min, or it resumes after the LAN cable is plugged in again.

The problem is that the indigo client lib has a global array of available servers, "indigo_available_servers", which prevents two client instances from connecting to the same server. If this array were not global, and each client held its own array instead, we could establish those multiple connections.

Of course, this will make a client implementation more complex; you have to take care that the "open client" is always used.
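The difference between one shared table and one table per client can be illustrated with a toy model. This is not the INDIGO client library: `ServerRegistry`, `DuplicateError`, the host name `indigo.local` and the port number are all hypothetical stand-ins chosen for the sketch.

```python
class DuplicateError(Exception):
    """Stand-in for the DUPLICATE result mentioned in the thread."""

class ServerRegistry:
    """Hypothetical table of server connections, keyed by (host, port)."""
    def __init__(self):
        self._entries = {}

    def connect(self, host, port):
        key = (host, port)
        if key in self._entries:
            raise DuplicateError(f"{host}:{port} is already registered")
        self._entries[key] = "connected"
        return key

def try_connect(registry, host="indigo.local", port=7624):
    """Return 'ok' or 'duplicate' instead of raising, for easy comparison."""
    try:
        registry.connect(host, port)
        return "ok"
    except DuplicateError:
        return "duplicate"

# One registry shared by all clients: the second connection is rejected.
shared = ServerRegistry()
shared_results = [try_connect(shared), try_connect(shared)]

# One registry per client: both connections are accepted.
per_client_results = [try_connect(ServerRegistry()), try_connect(ServerRegistry())]
```

In this model the shared registry yields `ok` followed by `duplicate`, while separate registries accept both connections. The price is that nothing stops the same server from being used twice.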

What do you think?

rumengb commented 4 years ago

I am afraid this approach will not work :( We have service discovery, and it will see the service twice, as it does now, report duplicate services and use only one. This is done for a reason. Devices are attached to the bus and are identified by their names, so the same device would have its name on the bus twice, and there is no network routing information in INDIGO. Network routing has nothing to do with INDIGO; it is done at the OS level.

Another problem I see is that if you send a property change request over both connections, it will be delivered to the device twice and will almost always result in an error. For example, the second start exposure will fail because there is an exposure in progress.

The last thing that concerns me is that I cannot control which physical interface the OS will use to establish the connection. It is the OS routing layer that does that; the clients have barely any control over this. If you want such "highly available" connections and dynamic connection routing, you can configure some sort of interface bonding at the OS level, but not in the application.

Can you give me an example of an application that works like this? I tried SSH and it does not work as you describe...

Maybe I am missing something?

Rumen

kkretzschmar commented 4 years ago

I am a little bit more optimistic. I think it is possible to bind a socket to a specific IP address of the client host, and it is possible to get the IP address of the network devices via the multicast DNS (DNS-SD) API. In addition the DNS-SD API informs you if a network device was attached or detached, via the interface index. But I have to try it.

Klaus
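Binding a client socket to a chosen local address before connecting can be sketched as follows. The helper name `connect_from` is hypothetical, and the loopback listener only stands in for a server so that the example runs on any machine. Note that binding fixes the source address; which physical interface the packets leave through is still decided by the OS routing table.

```python
import socket
import threading

def connect_from(local_ip, server_addr, timeout=5.0):
    """Open a TCP connection whose source address is local_ip (hypothetical helper)."""
    # Port 0 lets the OS pick a free source port on that address.
    return socket.create_connection(server_addr, timeout=timeout,
                                    source_address=(local_ip, 0))

# Stand-in server on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
seen = {}

def accept_once():
    conn, peer = listener.accept()
    seen["peer_ip"] = peer[0]   # the source address as the server sees it
    conn.close()

worker = threading.Thread(target=accept_once)
worker.start()
client = connect_from("127.0.0.1", listener.getsockname())
local_ip = client.getsockname()[0]
worker.join()
client.close()
listener.close()
```

In a real client, `local_ip` would be the address of the LAN or WLAN interface rather than the loopback address.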

rumengb commented 4 years ago

If DNS-SD has such an event, this will solve the problem 100%, and it will not be complicated to handle.

kkretzschmar commented 4 years ago

Hi Rumen, I have a running prototype now, which requires only small changes to the indigo client sources. My client application can now handle several interfaces, each with a separate connection to the indigo server, and the indigo server has a worker thread for each interface (IP address). The server doesn't see that all worker threads belong to the same host, since the server only holds connections to IP addresses. Everything works as expected except uploading image files to the clients.

The problem is that after an image has been successfully created by the CCD device, it is sent to all clients, including clients that didn't send the exposure request. So when one client is disconnected, the server is blocked, since it is waiting to finish the upload to the disconnected client as well. You can easily reproduce this with two laptops by:

  1. connecting both laptops to the same indigo server
  2. disabling network access on one laptop
  3. requesting a new image with the laptop that is still connected

Is this expected behavior? Would it be possible with the current indigo architecture to send the image only to the client which sent the exposure request?

Thanks, Klaus

rumengb commented 4 years ago

Hi Klaus, I will answer later, in a couple of days. I am out without a computer. But I would love to see your changes :) Meanwhile Peter can answer if he is more connected :)

I decided to take a break for a couple of days after 2 months as a prisoner at home...

kkretzschmar commented 4 years ago

No rush, enjoy your free time ;) ... and stay safe!

polakovic commented 4 years ago

> The problem is that after an image was successfully created by the ccd device, it is sent to all clients, also to clients that didn't send the exposure request.

This should be controlled by the indigo_enable_blob() call. Images are uploaded only to clients which "subscribed" to receive blobs of a particular property. The safest solution is to use INDIGO_ENABLE_BLOB_URL mode; in that case only the URL is sent and the client can download the image itself.
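The idea behind URL mode, where the update carries only a link and each client fetches the payload when it wants to, can be shown with a generic HTTP sketch. This does not use the INDIGO API or its wire format: the handler, the path `/blob/ccd_image`, the `update` dictionary and the payload are all assumptions made for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

IMAGE = b"fake image bytes"  # placeholder payload, not a real image

class BlobHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/blob/ccd_image":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(IMAGE)))
        self.end_headers()
        self.wfile.write(IMAGE)

    def log_message(self, *args):
        pass  # keep the example quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), BlobHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# What every subscriber would receive: a short message carrying only the URL.
update = {"property": "CCD_IMAGE", "url": f"http://127.0.0.1:{port}/blob/ccd_image"}

# Only a client that is reachable and interested pulls the payload itself.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # bypass proxies
with opener.open(update["url"], timeout=5) as response:
    downloaded = response.read()

server.shutdown()
server.server_close()
```

Because the sender only publishes a small message, a client that has dropped off the network never holds up a large transfer; it simply never requests the URL.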

kkretzschmar commented 4 years ago

Great, using the ENABLE_BLOB_URL mode will solve my problem. It is even better than the current upload implementation since it enables an asynchronous upload, if I understood correctly. Thanks!

polakovic commented 4 years ago

Yes, exactly.

kkretzschmar commented 4 years ago

It seems to work with only minimal changes on the indigo client side (see my pull request). Please review and let me know if I have missed something.

With these changes I am now able to hold many connections (with different interface indices) to the same server. When a network interface is disconnected (e.g. by unplugging the LAN cable), I detach the corresponding client from the indigo bus and attach another client from my internal list of clients that hold an open connection.

I am using the DNSServiceBrowse API to get the information when a service was added or removed.
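The failover bookkeeping described above might look roughly like this. The DNSServiceBrowse callback reports whether a service was added or removed (via its flags) together with an interface index; here those events are simulated by plain method calls. The `ConnectionPool` class, the client labels and the interface indices 4 and 7 are hypothetical and are not taken from the pull request.

```python
class ConnectionPool:
    """Hypothetical bookkeeping: one client connection per interface index."""
    def __init__(self):
        self.by_interface = {}   # interface index -> client label
        self.active = None       # interface index currently attached to the bus

    def on_browse_event(self, added, interface_index, client=None):
        """Handle a simulated 'service added/removed on interface' event."""
        if added:
            self.by_interface[interface_index] = client
            if self.active is None:
                self.active = interface_index
        else:
            self.by_interface.pop(interface_index, None)
            if self.active == interface_index:
                # Fail over to any remaining interface, or to none at all.
                self.active = next(iter(self.by_interface), None)
        return self.active

pool = ConnectionPool()
pool.on_browse_event(True, 4, "lan-client")    # service seen on the LAN interface
pool.on_browse_event(True, 7, "wlan-client")   # same service seen on WLAN
active_before = pool.active
pool.on_browse_event(False, 4)                 # LAN cable unplugged
active_after = pool.active
```

The design keeps every connection open but attaches only one client to the bus at a time, which avoids sending the same property change request twice.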

kkretzschmar commented 4 years ago

Thanks, it works.