salemove / freddy

Messaging api supporting request-response and acknowledgements.
MIT License
9 stars 0 forks source link

Add timeout to RMQ #87

Open kasesalum opened 1 year ago

kasesalum commented 1 year ago

Requests taking too long and never timing out could cause incidents, hence adding a configurable timeout. In case the timeout isn't added, no timeout is enforced.

kasesalum commented 1 year ago

@indrekj Added the reason, changed the code in a way that should not affect its current usage. I'm having trouble creating tests for it, though. Do you have any suggestions?

indrekj commented 1 year ago

I'm still not convinced that having this kind of high-level timeout is a good idea. See:

Maybe if it's large enough and is triggered on very exceptional cases, it could be ok.

cc @sviik

kasesalum commented 1 year ago

@indrekj Siim is on vacation for 2 weeks. Do you maybe have any suggestions on how to better approach the ticket?

indrekj commented 1 year ago

Hm... If we would go with a timeout then that has to end up as an error and also in the sentry (and opsgenie). The current "logger.warn" will get ignored and if we get into a corrupted state then that would be really bad. We probably would need to make sure the error reaches the actual application and then the application can send it to Sentry.

I also looked into Rack. We don't have a timeout there either.

Rack has "Rack::Timeout" module, but it also warns about the corrupt state. One solution to avoid that is to restart the whole process on timeout.

I'm leaning toward not adding the timeout at all. Or if you do, then let's start on the application level, instead of adding it to the library immediately. And we probably should restart the process.