Open tomsluyts opened 5 years ago
We've been having an issue with one faulty subscription (notification endpoint down) impacting the other active subscriptions, because of retries flooding the processor. Is there a way to set up Orion so subscriptions are more isolated?
What does the impact on the other subscriptions look like? Could it be connection attempts accumulating while the timeout for the failing endpoint expires?
I think it might also make sense for Orion to offer an option to stop attempting delivery on a subscription once a certain number of tries have failed.
It would involve introducing a piece of state in the CB (a per-subscription failure counter), but it's a valuable idea anyway. I have created an issue for it here: https://github.com/telefonicaid/fiware-orion/issues/3541. Please feel free to add feedback as a comment on that issue.
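To make the per-subscription failure counter idea concrete, here is a minimal sketch in Python. It is not Orion code: the class `FailureGate`, its methods and the `max_failures` threshold are hypothetical names chosen for illustration, and a real implementation inside the CB would also have to decide how a disabled subscription gets re-enabled.

```python
from dataclasses import dataclass, field


@dataclass
class FailureGate:
    """Hypothetical per-subscription failure counter (not part of Orion)."""

    max_failures: int = 3  # assumed threshold, purely illustrative
    failures: dict = field(default_factory=dict)

    def should_notify(self, sub_id: str) -> bool:
        # Stop delivering once the consecutive-failure count reaches the limit.
        return self.failures.get(sub_id, 0) < self.max_failures

    def record(self, sub_id: str, ok: bool) -> None:
        if ok:
            # A successful delivery resets the counter for that subscription.
            self.failures.pop(sub_id, None)
        else:
            self.failures[sub_id] = self.failures.get(sub_id, 0) + 1


gate = FailureGate(max_failures=2)
gate.record("sub-down", ok=False)
gate.record("sub-down", ok=False)
gate.record("sub-healthy", ok=True)
```

In this sketch the counter tracks consecutive failures only, so an endpoint that recovers before reaching the threshold is never cut off.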
It was indeed an accumulation of connection attempts that time out, leaving no room for the other notifications.
I believe it is documented here: https://fiware-orion.readthedocs.io/en/master/admin/perf_tuning/index.html#outgoing-http-connection-timeout
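A rough back-of-the-envelope estimate shows why this accumulation happens. By Little's law, the average number of attempts stuck waiting equals the rate at which they are started multiplied by how long each one waits. The function name and the numbers below are assumptions for illustration only, not measurements from this deployment.

```python
def blocked_connections(failing_rate_per_s: float, timeout_s: float) -> float:
    """Average number of attempts waiting on a dead endpoint (Little's law)."""
    return failing_rate_per_s * timeout_s


# Illustrative values: 20 notifications/s towards the dead endpoint,
# each one blocking for a 5 s timeout before giving up.
pending = blocked_connections(20, 5)
```

With these example figures, about 100 attempts are pending at any moment, which can easily exhaust a bounded pool of notification workers or connections.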
Can you confirm that there is a possibility that other subscriptions can be affected?
Yes, it may affect the Context Broker globally. I think the explanation provided in the documentation section you cite is quite precise.
Have you tried setting -httpTimeout to a short value (e.g. 1.5*N, where N is the maximum time your system takes to establish a connection when the endpoints are working)?
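As a small worked example of the 1.5*N rule, the sketch below computes a timeout and builds the corresponding command line. It assumes that -httpTimeout takes its value in milliseconds (please check the CLI documentation for your Orion version); the helper name and the 200 ms figure for N are hypothetical.

```python
import math


def suggested_http_timeout_ms(max_connect_ms: float, factor: float = 1.5) -> int:
    """Apply the 1.5*N rule of thumb, rounding up to whole milliseconds."""
    return math.ceil(max_connect_ms * factor)


# Assumed N: healthy endpoints accept connections within 200 ms.
timeout_ms = suggested_http_timeout_ms(200)

# Command line for the broker; any other flags you already use are omitted.
args = ["contextBroker", "-httpTimeout", str(timeout_ms)]
```

A shorter timeout does not isolate subscriptions from each other, but it reduces how long each attempt against the dead endpoint occupies resources.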