Open beckje01 opened 6 years ago
Hi, Jeff.
Which storage backend are you referring to? We've definitely had a similar request before: a whitelist of tags to index for Cassandra.
On Tue, Aug 28, 2018 at 3:17 AM Jeff Beck notifications@github.com wrote:
Feature: Allow adding tags that will not be indexed, or whose indexing is somehow optional.
Rationale: Some tags that are useful in a span for debugging are high cardinality, so we may not want to search by them. Having a way to add unindexed tags lets users avoid blowing out the storage layer for the team operating Zipkin.
Example Scenario: While adding tags to spans about our API calls, we wanted to tag OAuth client IDs. We also keep things like user UUIDs, but ideally we don't need or want to index all the UUIDs.
Personally I use Cassandra.
The idea of making it something I can opt into while adding tags makes sense to me personally. I could express that a given tag doesn't have to be searchable, versus relying on a whitelist maintained by the operator of Zipkin.
But if we have either solution, we probably don't need both.
I was thinking it would make sense to let the storage backends decide what to do with the unindexed state of tags, since the cost of indexing may differ vastly between them.
Here's the former thing about whitelist: https://github.com/openzipkin/zipkin/issues/1928
I'm really not sure we will be able to control all indexing (e.g. in MySQL), so this would be a request to not index rather than a requirement to not index.
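To make the "request, not a requirement" distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Span` class, the `unindexed` field, and both storage classes are illustrative names only, not Zipkin's data model or storage API. It only shows how one backend might honor the hint while another, which cannot control its indexing, ignores it.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical span model; not Zipkin's actual data model."""
    name: str
    tags: dict = field(default_factory=dict)
    # Hypothetical hint: tag keys the instrumentation asks storage not to index.
    unindexed: set = field(default_factory=set)


class HintAwareStorage:
    """Toy backend that honors the hint: hinted tags are stored but not indexed."""

    def __init__(self):
        self.spans = []
        self.index = {}  # (tag key, tag value) -> list of span names

    def accept(self, span):
        self.spans.append(span)
        for key, value in span.tags.items():
            if key in span.unindexed:
                continue  # the tag stays on the span, but gets no index entry
            self.index.setdefault((key, value), []).append(span.name)


class IndexEverythingStorage(HintAwareStorage):
    """Toy backend that cannot control indexing, so it ignores the hint."""

    def accept(self, span):
        self.spans.append(span)
        for key, value in span.tags.items():
            self.index.setdefault((key, value), []).append(span.name)


span = Span(
    name="get /api",
    tags={"oauth.client_id": "abc", "user.uuid": "1234"},
    unindexed={"user.uuid"},
)
```

In this sketch the same span is valid for both backends: the first skips the index entry for `user.uuid`, the second indexes it anyway, and in both cases the tag is still stored on the span for debugging.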
Either way, there's work for the operator to do, unless you are hinting at a data model change, which would be vastly more work.
If this is a request for a data model change, I don't think we can realistically do it this year. It takes several months to a year of effort to move a data model through the ecosystem, and most of it isn't even completely on v2 yet. That doesn't mean we shouldn't track it.
Honestly, this request was intended to see whether this is a feature people would find useful. I think it makes sense, but I wanted to see whether it has broader appeal. I don't expect any immediate changes. If this feature seems like a better direction than the whitelist in #1928, I'm happy to try to come up with some possible technical solutions and propose them.
The more we have talked, the more I lean toward this being something that marks a tag as eligible to not be indexed.
Sounds good, Jeff. FWIW, Amazon X-Ray has this feature, called "metadata", so there is prior art. https://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html#api-segmentdocuments-metadata
Let's leave it open independently, as I personally would also like this as it makes data management in general easier. For example, we should never attempt to index queries!
Metadata is a way better way to put it. Mind if I edit the issue to reflect that terminology?
go for it!
Feature: Allow adding metadata to spans that will not be indexed.
Rationale: Some info that is useful in a span for debugging is high cardinality, so we may not want to search by it. Having a way to add metadata lets users avoid blowing out the storage layer for the team operating Zipkin.
Example Scenario: While adding metadata to spans about our API calls, we wanted to tag OAuth client IDs. We also keep things like user UUIDs, but ideally we don't need or want to index all the UUIDs.
Prior Art