simitt closed this issue 5 years ago
We've previously decided to keep `context` open in order to give us the freedom to dynamically add namespaces under `context` in agents if needed, without requiring an upgrade to the Elastic Stack. I believe this is an edge case, but let's keep it open for 6.2 and then consider locking it down for 6.3.
@roncohen should we look into this now for 6.3?
@roncohen can we please revisit this for 6.3, as we realized that e.g. @jahtalab is sending additional data within `context` for the frontend agent. It might be legit to do so, but we should discuss this: for example, right now we would additionally store the `userAgent` information there, although there is already another defined place in the `context` to store this information.
In case we decide to add more defined attributes in the future that should be indexed, this would be a breaking change afaict, as it could break existing indices if the new attribute was stored with a different type.
Also we might end up with a completely different schema for different agents.
update: as those fields won't get indexed, changes to the schema should not break anything. However this could still lead to a different data structure in ES than defined or expected.
From an agent developer's point of view, I'm ok with locking down the properties under `context`.
From time to time we need a new root property under `context`. But in those cases we probably need to involve the APM UI team anyway to make sure it's displayed as expected. So I'm fine with adding that overhead.
Especially with the unauthorized endpoint for the RUM agent, a malicious user could send arbitrary data and flood ES. This is a general problem we might want to address with rate limiting, but we might also want to limit which types of fields can be sent. Together with an upper limit for the values (for example 1024 chars), 50 stacktrace elements etc. we effectively have an upper limit for one span, which probably helps.
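To make the limits mentioned above concrete, here is a minimal sketch in Python. It is not the APM Server implementation: the names `MAX_VALUE_CHARS`, `MAX_STACKTRACE_FRAMES`, `truncate_value` and `limit_span` are hypothetical, the span layout (a `stacktrace` list at the top level) is an assumption, and the numbers are just the example values from the comment. It also does not cap the number of keys, so on its own it is not a complete upper bound.

```python
from typing import Any

# Hypothetical limits, taken from the example numbers in the comment above.
MAX_VALUE_CHARS = 1024
MAX_STACKTRACE_FRAMES = 50


def truncate_value(value: Any) -> Any:
    """Return a copy of value with every string cut to MAX_VALUE_CHARS."""
    if isinstance(value, str):
        return value[:MAX_VALUE_CHARS]
    if isinstance(value, dict):
        # Only values are truncated here; keys are left untouched.
        return {k: truncate_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [truncate_value(v) for v in value]
    return value


def limit_span(span: dict) -> dict:
    """Apply the string limit and cap the number of stacktrace frames."""
    limited = truncate_value(span)
    frames = limited.get("stacktrace")
    if isinstance(frames, list):
        limited["stacktrace"] = frames[:MAX_STACKTRACE_FRAMES]
    return limited
```

Truncating instead of rejecting keeps the request processable, which matches the preference expressed elsewhere in this thread for not failing requests over unexpected data.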
@roncohen can we revisit if this is a feature or a bug for V2, and then decide and close this issue.
thanks @simitt. @jahtalab which additional information are you sending today?
@roncohen ,
Currently we're sending the following additional fields:
{
  page: { referer: "", url: "" }
}
I'm ok with locking down the context and/or setting limits on the values.
I suggest all @elastic/apm-agent-devs list the currently added attributes, so we can formalize them in the JSON schema and then stop processing additional data sent within `context`.
@alvarolobato, @roncohen can we please move forward with this.
sounds good to me
No added attributes in Ruby agent.
No added attributes in the Java agent.
None for Go.
We are going to aim to do this for V2, but it needs more investigation and will probably also be applied to V1. This isn't considered a blocker for V2 GA.
I need to add these additional fields to the span context:
note: the `http` context is already defined with `url` as its only field.
{
  "http": {
    "method": "GET",
    "sync": true,
    "status_code": 200
  }
}
I think we should start adding these additional fields to the v1 and v2 schemas for documentation purposes even though this is not enforced yet. Should I make a PR?
Also a general question: will the whole request fail if there are additional fields, or will they just not get stored?
@jahtalab feel free to open a PR on this.
In case additional fields are sent the server will just ignore them, but process the request otherwise. This is already the case for all other fields, to keep the upgrade path easy. (Otherwise the server would need to be updated before the agents can update).
We are going to aim to do this for V2
Trying to catch up - what is the current status, and what is this going to look like with ECS?
We haven't implemented that yet, but according to https://github.com/elastic/apm-server/issues/411#issuecomment-422024597 it would be fine to add it if we manage in time.
@watson @Qard @beniwohli are you sending any additional fields in `context` in `v1` or `v2` that we need to add to `context` when locking it?
@jahtalab is this suggestion https://github.com/elastic/apm-server/issues/411#issuecomment-423990546 aligned and ready for implementation?
@simitt , @eyalkoren wanted to add a similar data structure, so now we're discussing this here.
@simitt The Node.js agent adds a few fields under `context.custom`. But nothing in the root of `context`.
Referencing the "Add sync property to spans" issue.
We decided in the server meeting to not move forward with this for now but postpone it, as this could be considered to be a breaking change.
Makes sense to move to the 7.0 milestone then?
@roncohen @alvarolobato I would appreciate a decision on this to either close the issue or plan/prioritize 7.0 work accordingly
thanks @jalvz
Reading through the issue, I'm just making sure: Is the intention to change the server so that it would reject unknown fields under `context`, or just that we should stop storing them?
I'm fine to stop storing them, but I think rejection could be problematic. We're also not rejecting unknown fields anywhere on the root, for example, so it would be strange to start doing it here.
In case additional fields are sent the server will just ignore them, but process the request otherwise. This is already the case for all other fields, to keep the upgrade path easy. (Otherwise the server would need to be updated before the agents can update).
Thanks for the update @roncohen. This issue is about no longer storing unknown attributes, not about rejecting them. We are trying to get this in for 7.0 then (cc @elastic/apm-agent-devs).
@elastic/apm-agent-devs as discussed offline, the plan is to NOT store additional, undefined attributes sent up by the agents from 7.0 on. If no concerns or counter proposals are raised, we will move forward with this implementation.
We decided to only store objects defined in the JSON spec for the Intake API sent under the key `context` from 7.0 on.
At the ES level, we no longer store any information within `context` that is defined at the Intake API level. Therefore the only necessary change left is to ensure `context` itself is not stored at the ES level.
JSON validation allows agents to send up additional information without failing. Ensure that only the defined information is stored in ES, except for `context.custom` and `context.tags`, where we specifically allow for additional values.
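The rule described above (accept the request, store only what is defined, leave `context.custom` and `context.tags` open) could look roughly like the following Python sketch. This is an illustration under assumptions, not the server's actual code: `DEFINED_CONTEXT`, `OPEN_CONTEXT_KEYS` and `filter_context` are hypothetical names, and the listed keys are placeholders rather than the real Intake API spec.

```python
# Hypothetical stand-in for the defined part of the schema:
# top-level context key -> attribute names allowed beneath it.
DEFINED_CONTEXT = {
    "user": {"id", "email", "username"},
    "request": {"url", "method"},
}

# Keys under which arbitrary additional values are explicitly allowed.
OPEN_CONTEXT_KEYS = {"custom", "tags"}


def filter_context(context: dict) -> dict:
    """Return only the parts of context that should be stored.

    Unknown keys are dropped silently rather than rejected, so the
    request as a whole is still processed.
    """
    stored = {}
    for key, value in context.items():
        if key in OPEN_CONTEXT_KEYS:
            # Open namespaces are passed through unchanged.
            stored[key] = value
        elif key in DEFINED_CONTEXT and isinstance(value, dict):
            allowed = DEFINED_CONTEXT[key]
            stored[key] = {k: v for k, v in value.items() if k in allowed}
    return stored
```

With these placeholder definitions, a payload containing the frontend agent's `page` object would still be accepted, but `page` would not be stored until it is added to the defined schema.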