AbhiPrasad closed this issue 2 years ago.
Here's the functional form (since `makeRequest` is the only method a transport will override):
```typescript
function createTransport<O extends BaseTransportOptions>(
  options: O,
  makeRequest: (request: TransportRequest) => PromiseLike<TransportMakeRequestResponse>,
): INewTransport {
  const buffer = makePromiseBuffer(options.bufferSize || 30);
  const rateLimits: Record<string, number> = {};

  const flush = (timeout?: number): PromiseLike<boolean> => buffer.drain(timeout);

  function send(envelope: Envelope, type: SentryRequestType): PromiseLike<TransportResponse> {
    const request: TransportRequest = {
      // I'm undecided if the type API should work like this,
      // though we are a little stuck with this because of how
      // minimal the envelopes implementation is.
      // Perhaps there is a way we can expand it?
      type,
      body: serializeEnvelope(envelope),
    };

    if (isRateLimited(rateLimits, type)) {
      return rejectedSyncPromise(new SentryError(`oh no, disabled until: ${rateLimitDisableUntil(rateLimits, type)}`));
    }

    const requestTask = (): PromiseLike<TransportResponse> =>
      makeRequest(request).then(({ body, headers, reason, statusCode }): PromiseLike<TransportResponse> => {
        if (headers) {
          updateRateLimits(rateLimits, headers);
        }

        // TODO: This is the happy path!
        const status = eventStatusFromHttpCode(statusCode);
        if (status === 'success') {
          return resolvedSyncPromise({ status });
        }

        return rejectedSyncPromise(new SentryError(body || reason || 'Unknown transport error'));
      });

    return buffer.add(requestTask);
  }

  return { send, flush };
}
```
Please bear with my thoughts here, I have little context.
Why do you need that flush thing there? If that is required, is it always there? I remember the SessionFlusher class, and I remember that you didn't need the flush thingy to be part of the Transporter; the Transporter itself was something used by the SessionFlusher, but it didn't do anything other than "send this info":

```typescript
type Transporter = (request: TransportRequest) => PromiseLike<TransportMakeRequestResponse>;
```
Please help me understand when and why `flush` exists; I tried to look for it in the codebase but I got lost.
Something I like about your functional implementation is that it is really easy to see that you don't actually want people to have to mess around with buffering; the Transport itself is an actual function that is composed with the buffering (among whatever else you want it to be).
This means you can reduce the Transporter to be just:

```typescript
type Transporter = (request: TransportRequest) => PromiseLike<TransportMakeRequestResponse>;
```

which you will never expose directly, and your transporter will never need to deal with buffering.
In the OOP style, you keep entering the realm of abstract things, methods, and attributes when you barely have any actual state (closures will do it), so I am biased; besides, there are the extra bytes that you can potentially save.
Composition is easy to adapt, complexity is hard to pull apart.
Another way to look at this is that buffering, rate-limiting, or anything else is a simple wrapper (closure, middleware) around the transporter (like Redux, if you are familiar with it). For example:
```typescript
type Transporter = (request: TransportRequest) => PromiseLike<TransportMakeRequestResponse>;

const rateLimiting = (opts: { rate: number }) => (next: Transporter): Transporter => {
  const internalState = '.....';
  return request => {
    if (cantSendRightNow) {
      // ...
    }
    return next(request);
  };
};

const buffer = (opts: { size: number }) => (next: Transporter): Transporter => {
  const internalState = '.....';
  return request => {
    if (doINeedToBuffer) {
      // ...
    }
    return next(request);
  };
};

const finalProductionVersion = compose(
  buffer,
  rateLimit,
  fetcherTransporter,
);
```
Something around those lines.
P.S: a simple implementation of the idea https://github.com/straw-hat-team/fetcher
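The `compose` used above isn't defined in the snippet; for illustration, here is a minimal right-to-left composition helper in the style of Redux's `compose`, with a simplified request/response shape. All names here are hypothetical sketches, not the SDK's actual API:

```typescript
// Simplified request/response shapes for the sketch; the real
// TransportRequest/TransportMakeRequestResponse come from @sentry/types.
type Transporter = (request: { body: string }) => Promise<{ statusCode: number }>;
type Middleware = (next: Transporter) => Transporter;

// Right-to-left composition, like Redux's `compose`: the first middleware
// in the list becomes the outermost wrapper around the base transporter.
function compose(...middleware: Middleware[]): (base: Transporter) => Transporter {
  return base => middleware.reduceRight((next, mw) => mw(next), base);
}

// Hypothetical logging middleware, used only to demonstrate wrapping order.
const tag =
  (label: string, log: string[]): Middleware =>
  next =>
  request => {
    log.push(label);
    return next(request);
  };

const calls: string[] = [];
const base: Transporter = async () => ({ statusCode: 200 });
const transporter = compose(tag('buffer', calls), tag('rateLimit', calls))(base);
void transporter({ body: '{}' }); // wrappers run in order: buffer, then rateLimit
```

Each wrapper keeps its own state in its closure, which is exactly what real rate-limiting and buffering middleware would do.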
What I would do first is focus on the usage of such a Transporter: find where we use the interface as of today, make sure we actually need an interface instead of a simple function, and work backward to the edge.
I am gonna try to be more helpful; please push back and put the ownership on my side to explain myself. I wish I had some standups 😄 programming is hard.
> Please help me to understand when and why the flush exists, I tried to look for it in the codebase but I got lost.
We decided a long time ago that our capture calls should be fire-and-forget, and users should not be able to await a singular request. `flush`, however, is there so we can let users await delivery of all requests prior to exiting the process, be it during a crash (e.g. in Node) or before freezing the process (AWS Lambda, for example). In the majority of cases it's not used; it is required only for specific needs.
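Mechanically, `flush` boils down to draining the transport's promise buffer. A toy sketch of that idea follows; this is a hypothetical simplification, and the real `makePromiseBuffer` in `@sentry/utils` also enforces a size limit:

```typescript
// Toy promise buffer: tracks in-flight request promises so drain() can
// await them all. Hypothetical simplification of Sentry's makePromiseBuffer.
function makeToyPromiseBuffer<T>() {
  const inFlight = new Set<Promise<T>>();
  return {
    add(task: () => Promise<T>): Promise<T> {
      let promise: Promise<T>;
      promise = task().finally(() => inFlight.delete(promise));
      inFlight.add(promise);
      return promise;
    },
    // flush(timeout) on the transport is essentially this: wait until
    // everything in flight settles, or resolve false after the timeout.
    drain(timeout?: number): Promise<boolean> {
      const settled = Promise.allSettled([...inFlight]).then(() => true);
      if (timeout === undefined) return settled;
      const timedOut = new Promise<boolean>(resolve => setTimeout(() => resolve(false), timeout));
      return Promise.race([settled, timedOut]);
    },
  };
}
```

So a serverless wrapper, for example, would call `await transport.flush(2000)` before returning, so queued events are delivered before the runtime freezes the process.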
> Another way to look at this is that buffering, rate-limiting or anything else is a simple wrapper (closure, middleware) around the transporter (like Redux if you are familiar with it), for example:
Composition here is a really interesting suggestion, but considering the relatively fixed functionality of the transport, we can probably get away with the sole function constructor.
> What I would do first is to focus on the usage of such Transporter, meaning, finding where we use the interface as of today, make sure that we did need an interface instead of a simple function, and work backward to the edge.
Yup, great point. This is the inspiration for this GH issue in general.
We probably need to expand the interface for client outcomes:

```typescript
interface INewTransport {
  // TODO: How to best attach the type?
  // send a Sentry envelope to Sentry
  send(request: Envelope, type: SentryRequestType): PromiseLike<TransportResponse>;
  // flush everything in the transport buffer
  flush(timeout: number): PromiseLike<boolean>;
  // record client outcomes https://develop.sentry.dev/sdk/client-reports/
  record(reason: Outcome, category: SentryRequestType): void;
}
```
There's also been chats about how exactly we migrate this, as we want to minimize the amount of time we are on the major branch.
We can first introduce the new Transport interfaces and have tests around them.
We can then convert the client to send all envelope items to the new transport. This does mean we'll have to accept a temporary bundle size increase. We'll have to flag this somehow if a user provides a custom transport, as that will override everything. Since we use category-based rate limits, rate limiting should still work fine.
```typescript
public sendEvent(event: Event): void {
  if (event.type === 'transaction' && this.notCustomTransport()) {
    void this._newTransport
      .sendEnvelope(eventToEnvelope(event), 'transaction')
      .then(null, reason => console.error(reason));
  } else {
    void this._transport.sendEvent(event).then(null, reason => console.error(reason));
  }
}
```
^ How does that sound?
There is now an issue of buffer size though, since having 2 transports running will mean two buffers (double the total number of items). @kamilogorek do you think that is fine?
Newer Node versions introduced worker threads (https://nodejs.org/api/worker_threads.html). It would be interesting to see how the functional transport could be used while running on a separate thread. Similarly, could we have a web worker transport?
We also need to make sure `makeRequest` works well for offline transports. This might mean that individual transport implementations need to be able to configure a custom buffer data structure, though it might just mean that if "offline" mode is active, the base transport from `createTransport` just re-queues the error. I'm struggling to think through the best design for an offline system though. Maybe the base transport try/catches `makeRequest`?
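One concrete shape for that try/catch idea, as a hedged sketch: a wrapper around `makeRequest` that queues failed requests and retries them later. All names here are hypothetical, and a real implementation would persist the queue to disk and cap its size; this sketch keeps everything in memory:

```typescript
type MakeRequest = (request: { body: string }) => Promise<{ statusCode: number }>;

// Hypothetical offline wrapper: on failure, stash the request so it can be
// retried later once connectivity returns.
function makeOfflineWrapper(makeRequest: MakeRequest) {
  const queued: { body: string }[] = [];

  const wrapped: MakeRequest = async request => {
    try {
      return await makeRequest(request);
    } catch (error) {
      queued.push(request); // remember the request for a later retry
      throw error;
    }
  };

  // Re-attempt everything captured while "offline"; stop at the first
  // failure, since we are presumably still offline.
  const flushQueued = async (): Promise<number> => {
    let delivered = 0;
    while (queued.length > 0) {
      const request = queued[0];
      if (!request) break;
      try {
        await makeRequest(request);
        queued.shift();
        delivered += 1;
      } catch {
        break;
      }
    }
    return delivered;
  };

  return { wrapped, flushQueued, queued };
}
```

The base transport could then call `flushQueued` on a timer or on a connectivity event, without the individual `makeRequest` implementations knowing anything about offline handling.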
After writing a lot, I found some stuff that brought me way too many thoughts; by the time I got to the end and read more code, something lit up.
Leaving some notes for myself, feel free to expand them. My apologies if it bothers you, I'm just leaving a paper trail of thoughts that will hopefully be useful at some point for someone.
@AbhiPrasad is there any opportunity or way that I could hang out with you on Discord or something real-time? Maybe pairing on some coding?
Hopefully could be helpful for you.
@yordis I’m on discord - and available to chat anytime after 14:00 CET next week Monday/Tuesday/Wednesday.
Well, I am waking up early next week for you. Gonna send a DM on Discord.
> `// TODO: How to best attach the type?`
Why do we even need type if everything is going to be inside an envelope header?
> `// makes it way harder for users to implement their own transports`
Why though? If we provide all the generic functions, it shouldn't be thaaat bad. Also, as you pointed out in your functional example, `makeRequest` is the only thing people should need to provide to create their own transports.
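To illustrate how small that is in practice, here is a sketch of a hypothetical fetch-based `makeRequest`. The URL, the injectable `fetchImpl` parameter, and the simplified request/response types are illustrative assumptions, not the actual SDK API:

```typescript
interface TransportRequest {
  body: string;
}

interface TransportMakeRequestResponse {
  statusCode: number;
  headers: Record<string, string | null>;
}

// Hypothetical fetch-based makeRequest: the single function a custom
// transport author would hand to createTransport. fetchImpl is injectable
// so tests (or non-browser runtimes) can supply their own fetch.
const makeFetchRequest =
  (url: string, fetchImpl: typeof fetch = fetch) =>
  async (request: TransportRequest): Promise<TransportMakeRequestResponse> => {
    const response = await fetchImpl(url, { method: 'POST', body: request.body });
    return {
      statusCode: response.status,
      // Forward the headers the base transport needs for rate limiting.
      headers: {
        'x-sentry-rate-limits': response.headers.get('X-Sentry-Rate-Limits'),
        'retry-after': response.headers.get('Retry-After'),
      },
    };
  };
```

Everything else (buffering, rate limits, flushing) would come from the shared base transport.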
> We probably need to expand the interface for client outcomes
I'm not sure about that tbh. It's quite hard to justify the existence of this functionality inside the transport. I never liked it, but we added it because there is no other easy way currently. If anything, I'd add a method for sending reports via beacon (e.g. a `report` method, not `record`).
I'd much prefer it to be something client-bound, rather than have it live inside the transport. It would also be possible to tree-shake it if someone decides not to use client reports (not based on this dynamic boolean in the options, but there are ways to make such features tree-shakeable; e.g. see https://github.com/getsentry/sentry-javascript/blob/v7-dev/packages/browser/src/sdk.ts#L22-L53 for how I made `defaultIntegrations` tree-shakeable; tl;dr when using the raw `BrowserClient`, it won't be used, thus dropped).
> ^ How does that sound?
As a temporary workaround it should be enough.
> There is now an issue of the buffer size though, since having 2 transports running will mean two buffers (double total amount of items). @kamilogorek you think that is fine?
That's fine IMO, transactions are usually not flooding the buffer anyway, and your additional branch will only serve those.
> I'm struggling to think through the best design for an offline system though. Maybe the base transport try catches makeRequest?
It should IMO happen prior to the actual sending call. For Node we don't really need offline support, as I'd not expect servers to have connectivity problems. It's more important to target mobile/browsers. For those, we can rely on browser APIs to detect the internet connection. It's not bulletproof, but pinging DNS servers before each call is overkill IMO. And it takes time to rely on request timeouts, especially since requests stay in the buffer for the time they are being resolved.
> How many times does the flushing implementation change based on the life-cycle of the application? Does the flushing implementation change per environment (testing, prod, dev)? Does it change per runtime environment (React Native, Node, browser)?
No, no, and no. Flushing here is nothing more than locking the buffer and calling `await` on it until it drains :)
> The Hub is something that deals with flushing, not the transporter or something like that.
It's only like that because the `hub` is the proxy to the client, which in turn is bound to a given transport instance.
> Most of the sdk.ts files per package are almost the same, and in some cases just re-export from the previous one. Each package feels more like a series of metadata and integrations based on the environment. Why wouldn't you make the Hub package the thing that you initialize, and have the Browser, React, ..., Node packages not hide the fact that there is a hub thingy? Something along the lines of:
This is effectively what `init` of each `sdk.ts` does right now. It adds default integrations/options for the given environment, but other than that, it uses shared code. I do agree though that plenty of code should be extracted even more. Even the `flush` of all implementations is basically the same, yet we duplicate this code.
> Why do we even need type if everything is going to be inside an envelope header?
It's because it's typed in the envelope item headers - not the envelope headers in general. This means we have to reach in and grab the envelope item header. I guess that is fine.
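For illustration, "reaching in" could look something like this, modeling an envelope as a tuple of envelope headers plus a list of `[item headers, payload]` items. The types are simplified and the helper name is hypothetical:

```typescript
// An envelope is a tuple of envelope headers plus a list of items, where
// each item is [item headers, payload]; per the envelope spec, `type`
// lives on the item headers. Simplified, illustrative types.
type EnvelopeItem = [{ type: string }, unknown];
type Envelope = [Record<string, unknown>, EnvelopeItem[]];

// Hypothetical helper: grab the type of the first item so the transport can
// apply category-based rate limits without a separate `type` argument.
function envelopeItemType(envelope: Envelope): string | undefined {
  const [, items] = envelope;
  return items[0]?.[0]?.type;
}
```

This only works cleanly if all items in an envelope share one rate-limit category, which is why the question of how to attach the type is still open.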
> I'd much prefer it to be something client-bound
Yeah this makes sense, but I'm gonna not solve it here tbh - let's just do it later.
> I'm not sure about that tbh. It's quite hard to justify the existence of this functionality inside the transport. I never liked it
So happy to see my thoughts are aligned with yours @kamilogorek; your response was actually the conclusion I was getting to the more I read the code and understood things.
> It should IMO be prior to actual sending call. For node we don't need offline really, as I'd not expect servers to have connectivity problems.
In the Electron SDK, we support offline by wrapping the default transport and persisting failed envelopes to try later: https://github.com/getsentry/sentry-electron/blob/master/src/main/transports/electron-offline-net.ts
> It's more important to target mobile/browsers. For those, we can rely on browser APIs to detect internet connection. It's not bulletproof...
It's more common to have `navigator.onLine === true` and not be able to make a connection than it is to ever see `navigator.onLine === false`. In my mind, this makes using it to detect connection status next to useless, because you've got to cater for the timeouts/failures anyway. All you know for sure is that if it's `false`, there's no point even trying.
If you install Docker, VirtualBox, VMware, or any VPN client, `navigator.onLine` is ALWAYS `true`, even with aeroplane mode enabled. This is even true for VPNs and Android!
> ...but pinging DNSs before each call is an overkill IMO. And it takes time to rely on request timeouts, especially that they can stay in the buffer for the time they are being resolved.
Yes, no real point waiting for DNS to timeout when you can just wait for the request to Sentry to timeout!
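Given the above, `navigator.onLine` is only useful as a negative signal. A sketch of the one policy that seems safe: the flag is passed in rather than read from `navigator` so the function also runs outside a browser, and the name is hypothetical:

```typescript
// navigator.onLine is only trustworthy when it is `false`: true does not
// guarantee connectivity, but false means sending is pointless.
function shouldAttemptSend(onLine: boolean | undefined): boolean {
  // Unknown (non-browser environment, API unavailable): try anyway and
  // rely on the request's own timeout/failure handling.
  if (onLine === undefined) return true;
  // true is unreliable (VPNs, VMs, Docker can pin it to true), so still try.
  if (onLine === true) return true;
  // false is the only definitive signal: skip the attempt entirely.
  return false;
}
```

In other words, the offline wrapper still needs full failure handling; `onLine === false` is just an optimization to skip attempts that are guaranteed to fail.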
> If you install Docker, Virtual Box, VMWare or any VPN client navigator.onLine is ALWAYS true, even with aeroplane mode enabled. This is even true for VPNs and Android!
jeez, TIL 🥲
All new transports have been merged in - now just waiting on v7 for us to delete the old ones.
As part of https://github.com/getsentry/sentry-javascript/issues/4240#issuecomment-1035323682, we want to revamp the Transports API to make it easier to extend (adding support for attachments), consolidate the options, and reduce bundle size.
This issue forms a living document on how we are going to tackle this - first by setting up an API that can be (possibly) merged in before v7.
Much of this is inspired by the transport design of https://github.com/getsentry/sentry-javascript/blob/v7-dev/packages/transport-base/src/transport.ts
## Options
To start, let's look at the current state of the options:
https://github.com/getsentry/sentry-javascript/blob/b1009b5766b34b77781e28cebbf6cbde870e01a2/packages/types/src/transport.ts#L51-L74
Doing a quick run-through, here's what that looks like:
This means we can probably reduce it down to:
## API
The transport does a couple of things in the SDK, but we can think about it mainly as turning a `SentryRequest` into a `Response` of some kind. Due to rate-limiting, the transport is essentially a closed-loop control system, so the `Response` must be of some form that guides the transport (and the SDK in general) in making future decisions about how it should function: `SentryRequest<T>` -> `SentryResponse`.

`SentryRequest` becomes generic (defaulting to `T = string`), so transports can work with buffers/event-emitters/streams as the body if it so pleases. This also leaves us open to having non-http transports (although we'll probably never have that as a default). This assumes that in v7, the client is in charge of creating the envelope (because it has to be able to dynamically add items).
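The generic `SentryRequest<T>` idea can be sketched as follows; these are illustrative types only, not the actual SDK interfaces:

```typescript
// SentryRequest is generic over its body so a transport can pick its own
// representation: string by default, but e.g. Uint8Array or a stream for
// specialised transports. Illustrative types only.
interface SentryRequest<T = string> {
  type: string;
  body: T;
}

interface SentryResponse {
  status: 'success' | 'failed' | 'rate_limit';
}

// A transport is then just a function from SentryRequest<T> to SentryResponse.
type Transport<T = string> = (request: SentryRequest<T>) => PromiseLike<SentryResponse>;

// Hypothetical example: a transport whose request body is raw bytes.
const binaryTransport: Transport<Uint8Array> = async request => ({
  status: request.body.length > 0 ? 'success' : 'failed',
});
```

The default `T = string` keeps the common serialized-envelope case simple while leaving the door open for the buffer/stream bodies mentioned above.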
We can actually make this entirely functional
Sticking with classes though
Edit: Have to incorporate client outcomes here - will figure that out in a bit.