Add a getNumRemainingStatusRequests() function to SessionImpl to get the current integer as it decreases

nmontana42 commented 7 months ago

I'm interested in creating a PriorityQueue containing n sessions, prioritized by their remaining status requests. The queue would hand out the session that is most likely to be configured and open. At the moment there doesn't seem to be a public method that returns that integer. I believe the closest thing is statically referencing the MAX_STATUS_RETRY int declared inside SessionImpl.

This seems more robust than using the most recent timestamp.
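
To make the idea concrete, here is a minimal sketch of such a queue. Everything in it is an assumption for illustration: `RetryAwareSession` and `getNumRemainingStatusRequests()` stand in for the accessor requested in this issue (SessionImpl does not expose it today), `FakeSession` only exists so the sketch runs without OpenKit, and "more remaining requests first" is a guess at the desired ordering that can be flipped by dropping `reversed()`.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical view of a session; this accessor is the one proposed in this issue.
interface RetryAwareSession {
    int getNumRemainingStatusRequests();
}

// Illustration-only stand-in so the sketch runs without OpenKit on the classpath.
final class FakeSession implements RetryAwareSession {
    private final String name;
    private final int remaining;

    FakeSession(String name, int remaining) {
        this.name = name;
        this.remaining = remaining;
    }

    @Override
    public int getNumRemainingStatusRequests() {
        return remaining;
    }

    String name() {
        return name;
    }
}

final class SessionQueue<S extends RetryAwareSession> {
    // The priority is captured at insertion time. A heap does not re-sort when a
    // key changes later, so a session whose counter has decreased must be
    // polled and offered again to be re-ranked.
    private static final class Entry<S> {
        final S session;
        final int priority;

        Entry(S session, int priority) {
            this.session = session;
            this.priority = priority;
        }
    }

    // Assumed ordering: highest remaining count is served first.
    private final PriorityBlockingQueue<Entry<S>> queue = new PriorityBlockingQueue<>(
            11, Comparator.comparingInt((Entry<S> e) -> e.priority).reversed());

    void offer(S session) {
        queue.offer(new Entry<>(session, session.getNumRemainingStatusRequests()));
    }

    // Returns null when no session is queued.
    S poll() {
        Entry<S> e = queue.poll();
        return e == null ? null : e.session;
    }
}
```

One caveat with this design: because the counter changes while a session sits in the queue, the ranking is only as fresh as the last `offer`.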

TheHighriser commented 7 months ago

My counter-question would be: why? What is the need for having several parallel sessions and selecting the one that is ready? What is the use case?

nmontana42 commented 7 months ago

Yeah, I probably could have been more descriptive in my previous comment. My organization is in the middle of a system migration from another APM vendor. Their Agent has the same kind of "sendBizEvent()" function that our applications currently use to send business metrics; you can find that on line 359 here. This is why we want to use OpenKit to capture bizEvents rather than the other methods (Agent/BizEvents API).

Our primary use cases are a handful of highly business-critical applications that handle large amounts of throughput. One of them makes over 100 requests a second and anticipates sending 10-12 million bizEvents a day. I don't know if we have any apps that would use OK's User Session capabilities, so at the moment let's keep this discussion around sending large amounts of bizEvents.

In our proof-of-concept we identified that it would be better if we customized our OpenKit implementation to handle the synchronous and thread-blocking processes asynchronously. Even if OpenKit minimizes any wait time I've been told that I should try to get it working so that there is always an available session to use by a calling thread. So I guess I'm asking you about any advice you might have here.

My main question is: how would we guarantee that we give a calling thread an available session? That doesn't seem possible with only one session, so maybe a priority queue based on remaining status requests would work.

Thoughts? Let me know if you have any other questions!
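
For reference, this is roughly the kind of asynchronous hand-off I have in mind, as a sketch under stated assumptions rather than anything OpenKit provides. `BizEventSink` and `AsyncBizEventDispatcher` are hypothetical names; the sink would wrap whatever actually sends the event (for example an OpenKit session), and the attribute map is simplified to strings. Calling threads never block: if the bounded buffer is full, `trySend` returns false and the caller decides what to do.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical abstraction over whatever actually sends the event.
interface BizEventSink {
    void send(String type, Map<String, String> attributes);
}

final class AsyncBizEventDispatcher implements AutoCloseable {
    private final ExecutorService worker;
    private final BizEventSink sink;

    AsyncBizEventDispatcher(BizEventSink sink, int queueCapacity) {
        this.sink = sink;
        // Single worker thread with a bounded queue; AbortPolicy rejects new
        // tasks when the queue is full instead of blocking the caller.
        this.worker = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns false (without blocking) if the buffer is full or the dispatcher is closed.
    boolean trySend(String type, Map<String, String> attributes) {
        try {
            worker.execute(() -> sink.send(type, attributes));
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    // Stops accepting events and waits up to five seconds for queued ones to be sent.
    @Override
    public void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a hand-off like this the calling thread no longer needs a ready session at call time, which may reduce the need for a pool of sessions in the first place.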

TheHighriser commented 7 months ago

Okay. So if this is really fully about BizEvents and nothing else, it might be better to simply ingest them without any OpenKit overhead? Did you see already: https://docs.dynatrace.com/docs/platform-modules/business-analytics/ba-api-ingest

nmontana42 commented 7 months ago

Hi Matthias,

Yes I am aware of all the different options for sending Bizevents and I've read all public documentation on this topic. To be clear, we still are interested in exploring/using OpenKit's other functionalities but at the moment we are primarily interested in its Bizevents support.

Does the bizevent API method have load balancing equivalent to what OpenKit/BeaconForwarding does? Outage prevention is very important to us, and OK sessions provide visibility into overload prevention via status requests. Our bizEvent ingest is going to be high, and if an ActiveGate goes down that's a problem. Is OpenKit's session + caching approach more resilient to overload compared to the direct API? That method may be an easier implementation, but is it better for our ActiveGates?

Are you able to comment on a detail in the OpenKit docs? OpenKit Bizevents are affected by the custom application cost controller, but based on the docs it appears that this may change in the future.

I appreciate your timely responses to my questions. Could we return to my original inquiry? Do you have any thoughts?

TheHighriser commented 7 months ago

"Does the bizevent API method have equivalent load-balancing that OpenKit/BeaconForwarding does?" The ingest pipeline has load balancing and prioritization for bizevents (bizevents are dropped last in case of overload) and will drop events if a certain limit is reached, but this limit is way above the 100 bizevents a second that you are mentioning. Obviously the limit depends on the size of the events, but the API is much more flexible (amount and size) than the SDK/Agent. So you could also ingest 1000+ bizevents a second.
"Is OpenKit's session + caching approach more resilient to overload compared to the direct API? That method may be an easier implementation but is it better for our ActiveGates?" As said above, the API is much more generous. Ingest via Agent (OpenKit) has many more limitations than the direct API regarding size and amount of events. In particular, 100 events a second via the Agent is already a lot, as the Agent does not use the ingest pipeline directly. It has to go a different route, which has many more limitations.
"Are you able to comment on a detail in the OpenKit docs. OpenKit Bizevents are affected by the custom application cost controller but based off the docs it appears that this may change in the future." This is simply an architectural matter (a current limitation); the docs are meant to emphasize that in the future we want to send events at all times, whenever that is possible for us.

nmontana42 commented 7 months ago

Matthias,

This is excellent information, thank you. Just to be clear, our application will not be sending 100 bizevents per second; it makes other service requests that total >100 requests/second. Our most mature app will be sending 10-12 million bizEvents per day, not 100 bizevents a second, although it's cool that ActiveGates can handle that. I also didn't think of OpenKit as an "Agent" itself, but it makes sense to use those words (OpenKit/Agent) interchangeably.

We are still interested in the OpenKit method for two reasons.

First, it's better for our architecture to use a dependency that supports bizevents rather than writing a natively built API layer. We distribute these capabilities to app teams through common proprietary libraries. For example, app teams get custom metrics through our monitoring framework, which auto-configures Micrometer. This dependency has roughly 1000 users. Second, we would also get RUM + Custom Application capabilities on top of bizevents. The more features we include in this project, the more redundant work we avoid.

TheHighriser commented 7 months ago

Sorry about the delay, I was OOO for a few days. Are there any open questions regarding OpenKit? Or are we still talking about your initial inquiry?

If so, where do you see the MAX_STATUS_RETRY? Or do you mean "numRemainingNewSessionRequests" in SessionImpl? Did you see https://github.com/Dynatrace/openkit-java/blob/main/src/main/java/com/dynatrace/openkit/core/communication/BeaconSendingContext.java#L561 - So basically the state of a session should be the best insight into whether a session is configured and ready to go.
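
As a rough illustration of selecting by state rather than by a retry counter, a sketch could look like the following. The enum and interface here are simplified, hypothetical stand-ins and not OpenKit's actual types; how (or whether) session state is reachable from application code is an assumption.

```java
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical states; not OpenKit's real session state type.
enum SketchSessionState { NEW, CONFIGURED_AND_OPEN, FINISHED }

// Hypothetical read-only view of a session's state.
interface StatefulSession {
    SketchSessionState state();
}

final class SessionPicker {
    private SessionPicker() {
    }

    // Returns the first session that is configured and still open, if any.
    static <S extends StatefulSession> Optional<S> firstReady(List<S> sessions) {
        return sessions.stream()
                .filter(s -> s.state() == SketchSessionState.CONFIGURED_AND_OPEN)
                .findFirst();
    }
}
```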

Dynatrace / openkit-java

Add a getNumRemainingStatusRequests() function to SessionImpl to get the current integer as it decreases #209