eiffel-community / eiffel-gerrit-plugin

Eiffel Gerrit plugin listens on the changes in Gerrit and sends out Eiffel events
Apache License 2.0
3 stars 15 forks source link

Add retry support for REMReM requests #32

Closed Christoffer-Cortes closed 4 years ago

Christoffer-Cortes commented 4 years ago

Applicable Issues

Closes https://github.com/eiffel-community/eiffel-gerrit-plugin/issues/31

Description of the Change

This solution uses Spring-retry to send multiple request attempts to REMReM. A retry configuration class contains the RetryTemplate bean with the settings for the retry mechanism. This RetryTemplate is then used to wrap the send method and catches the RuntimeExceptions thrown by the sender when a http request fails. The RuntimeException is what triggers the retry.

TODO: circuit breaker pattern needs to be investigated and other questions that might arise.

Alternate Designs

Benefits

Possible Drawbacks

Sign-off

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or

(c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.

(d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.

Signed-off-by: @Christoffer-Cortes

magnusbaeck commented 4 years ago

Doesn't Gerrit run the event listeners synchronously, i.e. blocking an event listener with a retry will block whatever Gerrit was doing that triggered the event (like a push operation). I'd expect that to be undesirable for many of not most users. Wouldn't it make more sense to have a limited in-memory queue like the rabbitmq plugin?

Christoffer-Cortes commented 4 years ago

Doesn't Gerrit run the event listeners synchronously, i.e. blocking an event listener with a retry will block whatever Gerrit was doing that triggered the event (like a push operation). I'd expect that to be undesirable for many of not most users. Wouldn't it make more sense to have a limited in-memory queue like the rabbitmq plugin?

I assumed that the listener was a threaded operation but perhaps it is not. We have some uncertainties around retries etc. even in REMReM. I'm not sure how it is handled in that service. I will take a look at this anyhow and see how we can make this safer and without blocking events.

magnusbaeck commented 4 years ago

I assumed that the listener was a threaded operation but perhaps it is not.

One of my colleagues suggested that event dispatching is non-blocking as of Gerrit 2.16 or 3.0 (a change Martin Fick supposedly was involved in), but it seems you want to support earlier releases as well. I haven't dug into the commit logs or release notes to verify that claim.

sselberg commented 4 years ago

I assumed that the listener was a threaded operation but perhaps it is not.

One of my colleagues suggested that event dispatching is non-blocking as of Gerrit 2.16 or 3.0 (a change Martin Fick supposedly was involved in), but it seems you want to support earlier releases as well. I haven't dug into the commit logs or release notes to verify that claim.

It's definitely blocking the event-bus in 2.14 and 2.15. That's why I implemented the listener -> queue -> publisher pattern in the rabbitmq plugin, I suggest you take a similar approach that does not depend on external resources to be able to handle the event. From a quick search I cannot find any changes in Gerrit core that matches these criteria. Could it be changes in the events plugin that was referenced?

Christoffer-Cortes commented 4 years ago

I assumed that the listener was a threaded operation but perhaps it is not.

One of my colleagues suggested that event dispatching is non-blocking as of Gerrit 2.16 or 3.0 (a change Martin Fick supposedly was involved in), but it seems you want to support earlier releases as well. I haven't dug into the commit logs or release notes to verify that claim.

It's definitely blocking the event-bus in 2.14 and 2.15. That's why I implemented the listener -> queue -> publisher pattern in the rabbitmq plugin, I suggest you take a similar approach that does not depend on external resources to be able to handle the event. From a quick search I cannot find any changes in Gerrit core that matches these criteria. Could it be changes in the events plugin that was referenced?

@erik-edling and I checked the changeMergedListener and started two submits on gerrit with a breakpoint inside prepareAndSendEiffelEvent. Gerrit spawned a thread for each event triggered resulting in two separate threads, so it does not appear to block subsequent events. Are you sure this is the case and if so, how could we reproduce this?

sselberg commented 4 years ago

@erik-edling and I checked the changeMergedListener and started two submits on gerrit with a breakpoint inside prepareAndSendEiffelEvent. Gerrit spawned a thread for each event triggered resulting in two separate threads, so it does not appear to block subsequent events. Are you sure this is the case and if so, how could we reproduce this?

So after this you should have two threads that are blocked on prepareAndSendEiffelEvent. What happens if you keep the breakpoint and do 1000 additional submits?

Christoffer-Cortes commented 4 years ago

@sselberg @magnusbaeck I've written an issue here. I've not had time to look into the rabbitmq plugin yet but I imagine we need to queue up jobs that are worked by a thread executor or something like that. For now we'll review and merge this as is and I will continue the work on the other issue. Thanks for bringing this to our attention.