Open hmdhk opened 5 years ago
I ran a couple of experiments to check the payload size of our current master with a few changes that should not cause any issues with APM Server or Kibana. The tests below are small changes that we can make today without modifying the intake API schema at the APM Server level.
Payload Size - 23745 Bytes
Step 1: remove the following fields
- `transaction.marks.navigationTiming`. These fields are not used in the UI.
- `context.http` for the resource timing spans, since the span name matches the URL of the request.
- `span.action`, which is not applicable to resource timing spans.

Payload Size - 18412 Bytes (~5.3 kB reduction)
Step 2: remove the following field
- `span.subType` for resource timing spans, since the UI currently does not distinguish between CSS, images, or JavaScript. This change might impact the UI in the future, but I could not think of anything it would break at the moment.

Payload Size - 17449 Bytes (~6.3 kB reduction in total)
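For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the two steps. The payload shape, the `slimTransaction` and `jsonSize` helpers, and the rule that resource timing spans are identified by `type === "resource"` are all assumptions made for illustration; only the field names come from the description above, and this is not the agent's actual code.

```typescript
// Hypothetical payload shapes: field names follow the discussion, but the
// real agent payload structure may differ.
interface RumSpan {
  name: string;
  type: string; // assumption: resource timing spans carry type "resource"
  subType?: string;
  action?: string;
  context?: { http?: { url: string } };
}

interface RumTransaction {
  name: string;
  marks?: {
    navigationTiming?: Record<string, number>;
    agent?: Record<string, number>;
  };
  spans: RumSpan[];
}

// Returns a slimmed copy; the input object is left untouched.
function slimTransaction(tr: RumTransaction, dropSubType: boolean): RumTransaction {
  const spans = tr.spans.map((span) => {
    if (span.type !== "resource") {
      return span; // only resource timing spans are trimmed
    }
    const slim: RumSpan = { ...span };
    delete slim.action; // step 1
    if (slim.context !== undefined) {
      slim.context = { ...slim.context };
      delete slim.context.http; // step 1: the span name already holds the URL
    }
    if (dropSubType) {
      delete slim.subType; // step 2
    }
    return slim;
  });
  const result: RumTransaction = { ...tr, spans };
  if (tr.marks !== undefined) {
    const marks = { ...tr.marks };
    delete marks.navigationTiming; // step 1
    result.marks = marks;
  }
  return result;
}

// Serialized length in UTF-16 code units; equal to bytes only for ASCII payloads.
const jsonSize = (value: unknown): number => JSON.stringify(value).length;

const sampleTransaction: RumTransaction = {
  name: "page-load",
  marks: {
    navigationTiming: { fetchStart: 0, domComplete: 950 },
    agent: { firstContentfulPaint: 400 },
  },
  spans: [
    {
      name: "http://example.com/app.css",
      type: "resource",
      subType: "css",
      action: "load",
      context: { http: { url: "http://example.com/app.css" } },
    },
    { name: "Parsing the document", type: "hard-navigation", subType: "browser-timing" },
  ],
};

console.log("baseline:", jsonSize(sampleTransaction));
console.log("step 1:", jsonSize(slimTransaction(sampleTransaction, false)));
console.log("step 1 + 2:", jsonSize(slimTransaction(sampleTransaction, true)));
```

The sample transaction is made up, so the sizes it prints will not match the figures above. The sketch is only meant to make precise which fields each step touches.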
Even though step 2 is not in the right direction, I would like to get thoughts on both steps. We can look into more advanced techniques, such as compressing the payload or sending it in chunks, later on if this becomes a major concern.
Thoughts @jahtalab @roncohen @alvarolobato ?
These measurements are all uncompressed, right? Do you know the compressed sizes and improvements? What are we trying to achieve here: reducing APM Server load, bandwidth, ES storage, or all of them?
Yes, all the measurements above are uncompressed. We are trying to optimise the size of the payload that we send to the APM Server, which in turn would reduce bandwidth consumption for the users.
Had a meeting with @vigneshshanmugam; we discussed the specific optimisations we want to make:
- `span.trace_id`: should be taken from the transaction if it doesn't exist on the span.
- `span.parent_id`: should be taken from the transaction if it doesn't exist on the span.
- `span.transaction_id`: should be taken from the transaction if it doesn't exist on the span.
- `span.sync`: the default should be false.
- `span_count`: the default should be calculated based on the number of spans.
- `navigationTiming`: we should add a configuration option for this and also remove some of the marks that we think are not very useful.
- `start` and `duration` should use integers instead of floats.
- `http.url` -> we should discuss whether it would be a good solution to use `span.name` instead (only for resource timing spans).

We're currently blocked by the optimization work on APM Server.
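To make the proposed defaults concrete, here is a rough sketch of how the agent could omit derivable fields and how the receiving side could restore them. The names (`FullSpan`, `CompactSpan`, `TransactionRef`, `compactSpan`, `restoreSpan`, `defaultSpanCount`) are hypothetical. Several details are also assumptions rather than decisions from the meeting: that a missing `parent_id` falls back to the transaction's `id`, that `span_count` has the shape `{ started: number }`, and that `Math.round` is the float-to-integer conversion.

```typescript
// Hypothetical shapes; only the field names come from the list above.
interface FullSpan {
  id: string;
  name: string;
  trace_id: string;
  parent_id: string;
  transaction_id: string;
  sync: boolean;
  start: number; // milliseconds
  duration: number; // milliseconds
}

// The same span with every derivable field made optional.
type CompactSpan = Pick<FullSpan, "id" | "name" | "start" | "duration"> &
  Partial<Pick<FullSpan, "trace_id" | "parent_id" | "transaction_id" | "sync">>;

interface TransactionRef {
  id: string;
  trace_id: string;
}

// Agent side: send a field only when the default would not reproduce it.
function compactSpan(span: FullSpan, tr: TransactionRef): CompactSpan {
  const out: CompactSpan = {
    id: span.id,
    name: span.name,
    // Integers instead of floats; this drops sub-millisecond precision.
    start: Math.round(span.start),
    duration: Math.round(span.duration),
  };
  if (span.trace_id !== tr.trace_id) out.trace_id = span.trace_id;
  if (span.parent_id !== tr.id) out.parent_id = span.parent_id; // assumed fallback
  if (span.transaction_id !== tr.id) out.transaction_id = span.transaction_id;
  if (span.sync) out.sync = true; // false is the default, so it is omitted
  return out;
}

// Receiving side: fill in whatever the agent left out.
function restoreSpan(span: CompactSpan, tr: TransactionRef): FullSpan {
  return {
    id: span.id,
    name: span.name,
    start: span.start,
    duration: span.duration,
    trace_id: span.trace_id ?? tr.trace_id,
    parent_id: span.parent_id ?? tr.id,
    transaction_id: span.transaction_id ?? tr.id,
    sync: span.sync ?? false,
  };
}

// span_count default: derived from the spans actually received.
function defaultSpanCount(
  spanCount: { started: number } | undefined,
  spans: CompactSpan[],
): { started: number } {
  return spanCount ?? { started: spans.length };
}

const exampleTx: TransactionRef = { id: "tx-1", trace_id: "trace-1" };
const exampleWire = compactSpan(
  {
    id: "span-1",
    name: "http://example.com/app.js",
    trace_id: "trace-1",
    parent_id: "tx-1",
    transaction_id: "tx-1",
    sync: false,
    start: 12.3456,
    duration: 7.89,
  },
  exampleTx,
);
console.log(JSON.stringify(exampleWire)); // what would go over the wire
console.log(JSON.stringify(restoreSpan(exampleWire, exampleTx)));
```

A span whose parent is another span rather than the transaction still carries its `parent_id` explicitly, so nested spans survive the round trip.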
Does the `span_count` need to be the actual value for unsampled transactions? If so, the agent needs to send the value for unsampled transactions.
cc @jalvz
@jahtalab the original spec for `span_count` is here: https://github.com/elastic/apm-server/issues/280
We will initially have a look at the current payload size and see if there are obvious things to improve.