Transport `send_envelope` should return a `Result`

timfish commented 2 years ago

I would like to create an offline transport that queues envelopes on disk if no connection is available. The only way to currently do this is to copy one of the existing transports and add the offline queueing. This is far from ideal because this either needs to be implemented for every one of the existing transports and keep up with any upstream changes or limit the underlying transports the the offline mode supports.

If send_envelope returned a result, it would be possible to wrap any of the existing transports with an offline transport. The client could either ignore the result or errors could be logged at that point rather than currently in every transport.

Swatinem commented 2 years ago

I think this is fair to do.

How would this look like in the end? Thus far the transports are designed as "fire and forget" and they should not block the thread that captures events for any longer than absolutely necessary.

How would you detect the connection state? What if there are envelopes queued already and the connection is lost?

timfish commented 2 years ago

I experimented with channels for returning the result but this on its own would obviously block.

Maybe we could have some sort of transport result wrapper so that callers can choose whether to block and wait for the result.

Much like most of the other SDKs, we probably can't reliably detect if a connection is available. We'll just need to separate out connection failures from rate-limit or other errors that will result in different queuing/retry logic. Off the top of my head I can't remember which ones should result in what!

Swatinem commented 2 years ago

You can always return a Future, but document the requirements around that:

callers (the SDK Client / transport wrappers) don’t need to await/poll it, they can just ignore/drop it right away and things will work just fine
the transport needs to make sure the future is spawned on its internal runtime and does not block the submitting thread.

As you said, modeling the result/error type would be difficult now, as it would leak implementation details of the transport, more specifically its Error type. Unless you require that to be a boxed error, which ofc we can do no problem. If you want to peek deeper you can always try to downcast.

timfish commented 2 years ago

You can always return a Future

A Future would work but I guess this would require Transport and any implementations to use the async-trait crate which isn't currently a dependency. Not sure if that's an issue or not?

As you said, modeling the result/error type would be difficult now, as it would leak implementation details of the transport

For errors I was going to use an enum with a few categories including some sort of general transport failure that would contain a boxed error and then one for RateLimted etc.

pub enum SendEnvelopeError {
  Failure(Box<dyn Error>),
  RateLimited,
  Timeout,
}

pub type SendEnvelopeResult = Result<(), SendEnvelopeError>;

#[async_trait]
pub trait Transport: Send + Sync + 'static {
    async fn send_envelope(&self, envelope: Envelope) -> SendEnvelopeResult;

If you want to avoid making Transport async, SendEnvelopeResult could be a simple wrapper for one end of a channel with a wait function. This way, if wait is not called, no blocking will occur. This has the downside that channels will be created and not used in non-waiting cases.

pub struct SendEnvelopeResult {
  receiver: Receiver<Result<(), SendEnvelopeError>>
}

impl SendEnvelopeResult {
  pub fn wait(timeout: Duration) -> Result<(), SendEnvelopeError> {
    self.receiver.recv_timeout(timeout).unwrap_whichever(SendEnvelopeError::Timeout)
  }
}

You could take this one step further and have all the transports return the status_code and headers on success. This way the rate limiting could itself be written as a wrapper around all the other transports rather than RateLimiter being passed into every transport. This would allow you to pull the rate limit checks and filtering out of TransportThread.

pub struct SendEnvelopeSuccess {
  status_code: StatusCode, // or something more universal like u16?
  sentry_rate_limits: Option<String>,
  retry_after: Option<String>,
  // or return all headers so transport wrappers can deal with other headers in the future?
}

pub type SendEnvelopeResult = Result<SendEnvelopeSuccess, SendEnvelopeError>;

#[async_trait]
pub trait Transport: Send + Sync + 'static {
    async fn send_envelope(&self, envelope: Envelope) -> SendEnvelopeResult;

The JavaScript transports were recently refactored to be simple functions returning something similar to above and use a wrapper to handle rate limiting. This has the added bonus that when customers create their own transport, it's trivial for them to include the correct, tested rate limiting logic!

getsentry / sentry-rust

Transport `send_envelope` should return a `Result` #482