paulscherrerinstitute / StreamDevice

EPICS Driver for message based I/O
GNU General Public License v3.0

Connection redundancy (Feature) #77

Closed: gfrn closed 2 years ago

gfrn commented 3 years ago

I'd like to know if there is a feature in StreamDevice (or a plan to add one in the future) that permits "falling back" to a secondary connection natively, in order to avoid duplicating already existing work.

As an example, suppose we have two clients reading voltage information (which is mirrored on both devices). Is there a way to open a connection to both on startup (or later, on demand) and switch between these two connections/IP addresses in case one fails, automatically returning to the first one?

Thanks in advance

dirk-zimoch commented 3 years ago

Each StreamDevice record is connected to one "bus instance" at initialization. Usually (in 100% of all cases) that is an instance of asynOctet. So the easiest way to implement this feature would be in an asyn driver. There are so-called "interpose" layers in asyn. They provide the standard asyn interface functions to the user (StreamDevice) but do not connect to hardware directly; instead they connect to a "real" asyn port driver. Or two (or more) in your case. So I think it should be easy to write an "asynInterposeFallback" driver and configure it to use a number of real asyn drivers. But this would not be a StreamDevice feature. Here are some example interpose drivers I wrote: https://github.com/paulscherrerinstitute/asynInterposeDelay https://github.com/paulscherrerinstitute/asynInterposeEcho

dirk-zimoch commented 3 years ago

OK, I checked and found that you cannot use an interpose layer for this, because that too only works on a single low-level port. But writing a separate asyn driver that does this is possible.
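
To make the idea of a separate failover driver concrete, here is a minimal, self-contained sketch in C. It is not the code discussed in this thread and it does not use the asyn API: `fake_port`, `fake_port_write`, `failover_port` and `failover_write` are hypothetical names, and the two ports are simulated in memory. A real driver would presumably hold one connection per low-level asyn port and forward each request to whichever one is currently active.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a low-level port. A real driver would keep a
 * connection to an asyn port here instead of a simulated link flag. */
typedef struct {
    const char *name;
    int up;              /* simulated link state: 1 = reachable */
    char last_msg[64];   /* last message this port accepted */
} fake_port;

/* Simulated write: fails with -1 when the link is down. */
static int fake_port_write(fake_port *p, const char *msg)
{
    if (!p->up)
        return -1;
    strncpy(p->last_msg, msg, sizeof p->last_msg - 1);
    p->last_msg[sizeof p->last_msg - 1] = '\0';
    return 0;
}

typedef struct {
    fake_port *ports[2]; /* [0] = primary, [1] = secondary */
    int active;          /* index of the port currently in use */
} failover_port;

/* Try the active port first; on failure switch to the other port and retry
 * once. Returns the index of the port that accepted the message, or -1 if
 * both ports failed. */
static int failover_write(failover_port *f, const char *msg)
{
    assert(f->active == 0 || f->active == 1);
    for (int attempt = 0; attempt < 2; attempt++) {
        if (fake_port_write(f->ports[f->active], msg) == 0)
            return f->active;
        f->active = 1 - f->active;  /* fall over to the other port */
    }
    return -1;
}
```

Note that in this sketch the active port only changes when a write fails, so once the secondary has taken over it stays active even after the primary comes back.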

gfrn commented 3 years ago

Understandable, that is what I'll be doing for now. Thanks for the quick reply!

dirk-zimoch commented 3 years ago

I hacked something together, but I did not expect it to take 500 lines to do so. I found that one problem is that connection loss is only detected on read. That means the last written message is lost without notice. And the IOC cannot find out whether a server port has come back until it tries to connect, which it only does when it wants to write something. But it writes to the failover port. So the primary port will not reconnect unless the secondary fails too. Anyway, I can send you my work as a starting point.
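
The fail-back problem described above (the primary is never retried while the secondary keeps working) could in principle be handled by probing the primary again after a hold-off interval. The following is only a sketch under that assumption, not the code offered in this thread. `failback_state`, `choose_port`, `primary_failed`, `fake_server` and `fake_connect` are hypothetical names, times are plain seconds supplied by the caller, and the connect attempt is simulated rather than made through asyn. It does not address the lost last write.

```c
#include <assert.h>

/* Hypothetical fail-back bookkeeping. */
typedef struct {
    int active;            /* 0 = primary, 1 = secondary */
    double next_probe;     /* earliest time to try the primary again */
    double probe_interval; /* assumed hold-off between probes, in seconds */
} failback_state;

/* Returns 1 if the (simulated) connect attempt succeeded, 0 otherwise. */
typedef int (*connect_fn)(void *ctx);

/* Record a failure of the primary and start the hold-off. */
static void primary_failed(failback_state *s, double now)
{
    s->active = 1;
    s->next_probe = now + s->probe_interval;
}

/* Pick the port for the next write. While on the secondary, probe the
 * primary at most once per hold-off interval and fail back if it answers. */
static int choose_port(failback_state *s, double now,
                       connect_fn try_primary, void *ctx)
{
    assert(s->probe_interval > 0.0);
    if (s->active == 1 && now >= s->next_probe) {
        if (try_primary(ctx))
            s->active = 0;                           /* primary is back */
        else
            s->next_probe = now + s->probe_interval; /* wait, then retry */
    }
    return s->active;
}

/* Simulated primary server for the example; counts connect attempts. */
typedef struct {
    int up;
    int attempts;
} fake_server;

static int fake_connect(void *ctx)
{
    fake_server *srv = ctx;
    srv->attempts++;
    return srv->up;
}
```

The hold-off keeps a dead primary from being hammered with a connect attempt on every write; the right interval depends on how long a connect timeout the application can tolerate.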

gfrn commented 3 years ago

That seems grand, I was working on my own solution as well for a while and I was a bit stuck on freeing users properly (in order to avoid pointing to invalid points in memory that were supposed to be event signals), so I guess that'll help quite a lot. Thanks once again!

Note: I'll test it further on actual hardware on Monday.

gfrn commented 3 years ago

It seems to work great (with a few changes made, but nothing too excessive). Although some edge-case tests still need to be performed, it shows the expected behavior for disconnects, ports that are unconnected at startup, and other situations.

@dirk-zimoch Thank you for the heads-up and directions. Could I make this code publicly available in the future, with all due credit (if it's not already available somewhere else)? If need be, I can also close this issue, considering this went beyond the scope of the StreamDevice repository.

dirk-zimoch commented 2 years ago

Sorry, I missed your previous message. Yes, you can use and publish the code as you like. It is nowhere else yet.

gfrn commented 2 years ago

Thank you, Dirk, I'll make sure to retain all authorship information.