Open shaeqahmed opened 1 year ago
I was thinking about the following idea:
Matano provides context data (such as the object path, the S3 bucket, and anything else that adds significant value from a transformation standpoint; the fields can keep growing if needed in future) for every event to the transformer. However, this is not enabled by default. It can be part of the log schema behind a flag, provide_matano_context: true, which has to be set explicitly to avoid the overhead of Matano generating metadata for every event/log. This data can live in the .matano.context or __metadata field of the event that is to be transformed (hoping that matano.context or __metadata is not used in any log structure, so that it does not get in the way of genuine log fields). The user can then transform this raw context in their VRL code. This will help in cases like this one, where some context has to be derived from contextual data provided by Matano. It will also avoid tampering with or hacking on select_table_from_payload_metadata.
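A rough sketch of the opt-in behaviour described above, written in Python rather than VRL so it can be run standalone. Everything here is an assumption for illustration: `attach_context`, the shape of the schema dict, and the `s3_bucket` / `object_path` field names are hypothetical and not part of Matano. Only the `provide_matano_context` flag and the `__metadata` key come from the comment.

```python
from typing import Any, Dict

# One of the two reserved names floated in the comment above.
CONTEXT_KEY = "__metadata"


def attach_context(
    event: Dict[str, Any],
    schema: Dict[str, Any],
    bucket: str,
    object_key: str,
) -> Dict[str, Any]:
    """Return the event, with context attached only if the schema opts in."""
    # Off by default, so log sources that do not need context pay nothing.
    if not schema.get("provide_matano_context", False):
        return event

    # Refuse to silently overwrite a genuine log field with the same name.
    if CONTEXT_KEY in event:
        raise ValueError(f"event already has a {CONTEXT_KEY!r} field")

    # Hypothetical context fields; the real set could grow over time.
    context = {"s3_bucket": bucket, "object_path": object_key}
    return {**event, CONTEXT_KEY: context}
```

Raising on a name collision is just one option; a namespaced key such as .matano.context would make collisions less likely in the first place.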
Problem
AWS ELB does not include the AWS account ID in each event payload; this information is only included in the path (e.g. aws-elb-logs/<account-id>/...). As a user, I would like to be able to query my AWS ELB logs using an AWS account ID field to filter/narrow down events.
Ideas
To support this in a generic way in our VRL transform without impacting performance (i.e. without requiring synchronization of threads in the hot path via a mutex), we would need to add a custom VRL function for looking up metadata (get_payload_metadata_field). We would also add a function like set_payload_metadata that could be used from the select_table_from_payload_metadata VRL expression to parse and populate some file-level metadata that the corresponding events can look up later. For example, for AWS ELB this may look like (pseudoscript):
Then from the transform we could use this info like:
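A minimal Python sketch of the two steps, not actual VRL. The function names mirror the proposed set_payload_metadata and get_payload_metadata_field, but their signatures, the `aws_account_id` key, and the `cloud.account.id` output field are assumptions made for illustration. The path layout is the aws-elb-logs/<account-id>/... prefix mentioned under Problem.

```python
import re
from typing import Any, Dict, Optional

# Matches the "aws-elb-logs/<account-id>/..." prefix from the problem statement.
ELB_PATH = re.compile(r"^aws-elb-logs/(?P<account_id>[^/]+)/")


def set_payload_metadata(object_key: str) -> Dict[str, str]:
    """Step 1: parse file-level metadata once per object, not once per event."""
    match = ELB_PATH.match(object_key)
    if match is None:
        return {}
    return {"aws_account_id": match.group("account_id")}


def get_payload_metadata_field(metadata: Dict[str, str], field: str) -> Optional[str]:
    """Step 2: read-only lookup, so the per-event hot path needs no locking."""
    return metadata.get(field)


def transform(event: Dict[str, Any], metadata: Dict[str, str]) -> Dict[str, Any]:
    """Copy the account ID from file-level metadata into the event, if present."""
    account_id = get_payload_metadata_field(metadata, "aws_account_id")
    if account_id is None:
        return dict(event)
    # Hypothetical target field; merge with any existing "cloud" object.
    cloud = {**event.get("cloud", {}), "account": {"id": account_id}}
    return {**event, "cloud": cloud}
```

The metadata dict is built once when the file is selected and only read afterwards, which is what lets every event from that file share it without synchronization.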
This is a bit too complicated for my liking, though, and this is a pretty niche use case (the only current applications are AWS ELB and potentially Route53). Generally, other sources do (and should) include important metadata in the event rather than relying on context bubbling up from the path, so I'd like to hold off on implementing a solution for this until it becomes clearer that it is worth it.