ChannelFinder / recsync

EPICS Record Synchronizor

BUG: Recsync server crashes when CF server is down #11

Open asoderq opened 8 years ago

asoderq commented 8 years ago

I am running ChannelFinder with Glassfish. I start both glassfish and recsync-server with systemd, on the same machine. The recsync-server service is configured to start after the glassfish service. However, at that point the ChannelFinder service does not seem to be up yet, which causes a crash in the recsync-server. It would be nice if this were handled and the recsync-server tried to connect again a while later.

It works fine after restarting the recsync-server service.

recsync-server-log.txt

mdavidsaver commented 8 years ago

Some more robustness would certainly be desirable.

Right now the handling of exceptions from plugins during commit() is strict. As recceiver works with deltas, dropping a single delta would leave a plugin out of sync, which would defeat the whole purpose.

Through the use of Deferred(), a plugin's commit() is allowed to take an arbitrarily long time to complete (although it should not do so by blocking). This should allow some delay and retry mechanism to be added to cfstore.

@shroffk fyi

asoderq commented 8 years ago

Also worth mentioning is that it crashes in a different way when the HTTP server is not even up yet, i.e. when glassfish is not even running.

shroffk commented 8 years ago

Just to clarify, the recsync server removes the cfstore support - the twistd server is still running, right?

I did not know about the Deferred() method - I had considered a mechanism in which the client would keep trying to create a connection with an exponential backoff, but was thwarted by my lack of knowledge of multi-threaded programming in Python. I guess this feature would be a good excuse to finally learn that.

mdavidsaver commented 8 years ago

I did not know about the Deferred() method

In this case, the extent of the knowledge necessary is to wrap with deferToThread() so that blocking calls are made on a worker thread.

http://twistedmatrix.com/documents/current/api/twisted.internet.threads.html#deferToThread

shroffk commented 8 years ago

So the 3 scenarios we want to handle

mskinner5278 commented 8 years ago

[Image: diagram1] Diagram for the new cf-store design.

mdavidsaver commented 8 years ago

@alex-soderqvist FYI work is in progress to address this situation.

shroffk commented 6 years ago

@mdavidsaver @alex-soderqvist can we close this?