When margebot encounters a network error during the merging process (e.g., a timeout when checking the CI status), it fails hard and adds an "I'm broken inside" comment. Because network errors can be transient, margebot should retry failed network requests.
I can try to make a PR for this, but I have the following questions:
Do you agree with the diagnosis and with the main proposal?
I'm not sure where to make the necessary modifications in the code. Specifically, I'm not sure at which granularity the retry should happen:
At the high level of single_merge_job's execute: we would not miss any errors (at least not during the merging process).
At a lower level (fetch_approvals and update_merge_request_and_accept, or even lower): we would have more specific context about which stage failed.
At a higher level (outside of single_merge_job)?
At a different level altogether, such as by patching the self._api object or changing some configuration of the underlying HTTP request library?
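To make the options above more concrete, here is a rough sketch of what a retry wrapper could look like, independent of where it ends up being applied. Everything in it is an assumption for illustration: `TransientNetworkError`, `retry_on`, and `fetch_ci_status` are hypothetical names, not part of margebot, and the real exception types raised by the API layer would need to be checked.

```python
import functools
import time


class TransientNetworkError(Exception):
    """Hypothetical stand-in for whatever the API layer raises on a timeout."""


def retry_on(exceptions, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped call on the given exceptions, with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: let the caller see the failure
                    # Wait base_delay, then 2x, 4x, ... before the next try.
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


# Hypothetical usage: a flaky call that times out twice, then succeeds.
calls = {"count": 0}


@retry_on(TransientNetworkError, attempts=3, base_delay=0.0)
def fetch_ci_status():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientNetworkError("timeout")
    return "success"


result = fetch_ci_status()
```

A decorator like this could be applied at any of the levels listed above. Applying it low (per request) keeps each retry cheap and specific, while applying it around execute would re-run the whole job, which is only safe if the steps are idempotent.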