fabric8io / fluent-plugin-kubernetes_metadata_filter

Enrich your fluentd events with Kubernetes metadata
Apache License 2.0

/var/log/containers/*.log and /var/log/pods/<podUID>/<containerName>_<instance#>.log #105

Closed — kenfdev closed this 2 years ago

kenfdev commented 6 years ago

Hi, first of all thanks for this awesome filter for fluentd on kubernetes. Recently, I bumped into this issue and was curious to know what the right way is (or is going to be).

According to the comment, /var/log/containers/*.log is going to be deprecated. Does this mean fluentd should monitor /var/log/pods/*/*.log instead of /var/log/containers/*.log? In addition, should the tag_to_kubernetes_name_regexp (filter regex) option extract meta info from a file path like the one below:

/var/log/pods/c079422d-e3cb-11e7-8643-000c29573710/calico-node_0.log

Note that the file above doesn't have namespace, container_name, or docker_id as meta info any more.

I'd appreciate any insights. Thanks in advance.

richm commented 6 years ago

Does this mean fluentd should monitor /var/log/pods/*/*.log instead of /var/log/containers/*.log?

Yes

Note that the file above doesn't have namespace, container_name, docker_id as meta info any more.

Good question. I have no idea how we will get that information. I suppose the code in this plugin will need to change to query the k8s api, given the pod uuid, to get back the other data, but I don't even know if that is possible . . .

kenfdev commented 6 years ago

Thank you for the response! I'm glad to know I'm on the right track, but yes, that leads to the issue about the metadata. I'm pretty sure this plugin isn't the only one relying on the file names... I wonder what others are going to do about it.

koalalorenzo commented 6 years ago

Do we have any way to implement it? We have a case where we are only able to read from /var/log/pods/*/*.log. Any suggestion?

richm commented 6 years ago

Any suggestion?

Get upstream Kubernetes to write files containing the namespace/pod/container metadata embedded in the filenames as they are with /var/log/containers/*.log? or otherwise provide the metadata?

stepierc commented 5 years ago

Get upstream Kubernetes to write files containing the namespace/pod/container metadata embedded in the filenames as they are with /var/log/containers/*.log? or otherwise provide the metadata?

Based on the information from Kubernetes (see previous comment), the links in /var/log/containers are definitely going away, and a replacement carrying the Kubernetes namespace and pod name is definitely not going to be available in the future. The only thing available from the filename will be the pod UUID.

Since the filter has an API watcher pulling in all the pod metadata (including pod UUID), it should be possible to create a map of pod UUID to kubernetes metadata, using the existing resources (API watcher).
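For illustration only (not code from this plugin): a minimal sketch of such a pod-UID-to-metadata map built from a watch, assuming the kubeclient gem the plugin already depends on and in-cluster service-account credentials; the key names are illustrative.

```ruby
require 'kubeclient'

# Assumed in-cluster auth paths; adjust for your environment.
auth = { bearer_token_file: '/var/run/secrets/kubernetes.io/serviceaccount/token' }
ssl  = { ca_file: '/var/run/secrets/kubernetes.io/serviceaccount/ca.crt' }
client = Kubeclient::Client.new('https://kubernetes.default.svc/api', 'v1',
                                auth_options: auth, ssl_options: ssl)

# Map pod UID -> metadata, maintained from the pod watch.
# In the plugin, this loop would run on the existing watch thread.
pods_by_uid = {}
client.watch_pods.each do |notice|
  meta = notice.object.metadata
  case notice.type
  when 'ADDED', 'MODIFIED'
    pods_by_uid[meta.uid] = { 'namespace_name' => meta.namespace, 'pod_name' => meta.name }
  when 'DELETED'
    pods_by_uid.delete(meta.uid)
  end
end
```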

I have written something like this for a different problem. We capture logs in our OpenShift environment that are written to emptyDir volumes, and send them to various output plugins based on namespace/project annotations. We add fields to the events based on the source file name and the emptyDir volume name; we get that data, along with the pod UUID, from the filename. I need to populate Kubernetes metadata based on the pod UUID.

I have POC code running in non-production that does this, but it uses the symlink tree in /var/log/containers/* to create the pod UUID mapping table. I'd like to switch to using the API data. Would the community want patches to start using the pod UUID rather than namespace/podname in the event tag for enriching Kubernetes data?

richm commented 5 years ago

I haven't seen any evidence that removing the old /var/log/containers/*.log naming convention is imminent. I would like to see such evidence. There are a lot of GitHub issues that say "will", "going to", "should", but I haven't seen an actual PR.

That being said - yes, it would be nice to be future proof.

jcantrill commented 3 years ago

From what I can read in the referenced proposal and subsequent links, there is no forthcoming action. Labeling as 'future'.

jcantrill commented 3 years ago

@alanconway please review to see if you can find any details about this coming anytime soon

alanconway commented 3 years ago

@jcantrill I think we should switch to the approach outlined in https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/issues/105#issuecomment-520990765 - take only UID from the filename, get everything else from the API. That will work now, so we can get it working at leisure and be ready for the k8s change, rather than scrambling to adapt when the k8s change lands.
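A rough sketch of that direction (my own illustration, not the plugin's actual code or option names): take only the pod UID and container name from the /var/log/pods path in the issue title, and resolve everything else from an API-backed cache such as the one sketched earlier.

```ruby
# Path shape from the issue title: /var/log/pods/<podUID>/<containerName>_<instance#>.log
UID_ONLY_PATH = %r{\A/var/log/pods/(?<pod_uuid>[a-f0-9-]+)/(?<container_name>[^_/]+)_(?<instance>\d+)\.log\z}

# pods_by_uid is a pod-UID -> metadata cache kept up to date from the API watch.
def metadata_for(path, pods_by_uid)
  m = UID_ONLY_PATH.match(path)
  return nil unless m

  # Only the UID, container name and instance come from the filename;
  # namespace, pod name, labels, etc. must come from the cache.
  (pods_by_uid[m[:pod_uuid]] || {}).merge(
    'pod_id'         => m[:pod_uuid],
    'container_name' => m[:container_name]
  )
end

metadata_for('/var/log/pods/c079422d-e3cb-11e7-8643-000c29573710/calico-node_0.log', {})
# => {"pod_id"=>"c079422d-e3cb-11e7-8643-000c29573710", "container_name"=>"calico-node"}
```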

jcantrill commented 3 years ago

@jcantrill I think we should switch to the approach outlined in #105 (comment) - take only UID from the filename, get everything else from the API. That will work now, so we can get it working at leisure and be ready for the k8s change, rather than scrambling to adapt when the k8s change lands.

Correct me if I am wrong, but I read the approach as this:

alanconway commented 3 years ago

* missing meta makes the log info undiscoverable by anything useful in the future (e.g. namespace/name) unless you happen to have the UUID

True, there's a window between pod deletion and log file deletion where we wouldn't be able to discover anything about the pod if the collector is started during that window. I guess we should stick with file discovery for now, but it sounds like we will have no choice at some point if the changes being discussed are implemented.

PaulAnnekov commented 3 years ago

We already faced this problem on Azure. AKS (Azure Kubernetes Service) uses containerd by default, and the folder /var/log/containers doesn't exist at all. The logs are stored only in files like /var/log/pods/tasks_mypod_00ef5a10-3110-4344-af8b-ddfbc0a00d4e/runner/0.log (the data inside is in CRI format). So we have the namespace (tasks), pod name (mypod), pod UUID (00ef5a10-3110-4344-af8b-ddfbc0a00d4e) and container name (runner). We are on a recent Kubernetes version; versions before 1.14 use a different structure.

We tried to use the tag_to_kubernetes_name_regexp parameter, but it waits for a docker_id inside the path, which we don't have. So it looks like the kubernetes_metadata plugin doesn't work for containerd at all. Also, note that the Docker engine is deprecated in newer Kubernetes versions, and it looks like the /var/log/containers path is created for the Docker engine only. So no Docker engine - no /var/log/containers.

PaulAnnekov commented 3 years ago

Sorry, I've rechecked the AKS node's /var/log folder and found that /var/log/containers exists. The directory and log structure are the same as before; just the contents of the files inside /var/log/containers are now in CRI format, not JSON. So we can continue to use this plugin.
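For anyone comparing the two on a node (illustrative samples, not taken from this cluster): the Docker json-file driver writes one JSON object per line, while CRI runtimes such as containerd write timestamp/stream/flag-prefixed plain lines.

```
# Docker json-file format
{"log":"hello from the app\n","stream":"stdout","time":"2021-07-05T09:00:00.000000001Z"}

# CRI format: <timestamp> <stream> <P|F partial/full flag> <message>
2021-07-05T09:00:00.000000001Z stdout F hello from the app
```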


jcantrill commented 3 years ago

Sorry, I've rechecked the AKS node's /var/log folder and found that /var/log/containers exists. The directory and log structure are the same as before; just the contents of the files inside /var/log/containers are now in CRI format, not JSON. So we can continue to use this plugin.

The format of the logs is irrelevant to this plugin. JSON parsing was removed many, many versions ago.

PaulAnnekov commented 3 years ago

@jcantrill you're right. That's why I said

So we can continue to use this plugin

jcantrill commented 3 years ago

We tried to use the tag_to_kubernetes_name_regexp parameter, but it waits for a docker_id inside the path, which we don't have. So it looks like the kubernetes_metadata plugin doesn't work for containerd at all.

I would not expect this plugin to "wait" for anything. I might expect it to provide no meta if it cannot find a "docker_id" capture group, unless you mean it throws an error. Looking at the code, this appears to only be used to populate the "docker" part of the hash and should result in nil if it cannot find that capture (TBD: test required).

I tested the following for the config parameter tag_to_kubernetes_name_regexp:

^\/var\/log\/pods\/(?<namespace>[^_]+)_(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?)_(?<pod_id>[-a-z0-9]*)(?<docker_id>_.*)?\/(?<container_name>.*)\/.*\.log$

Which matches the file path sample you provided and will work with the existing code with no modifications. It will provide 'nil' for the docker id in lieu of:

IndexError (undefined group name reference: docker_id)
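As a quick sanity check (my own snippet, not from the plugin's test suite), matching that regex against the sample path gives:

```ruby
regexp = %r{^\/var\/log\/pods\/(?<namespace>[^_]+)_(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?)_(?<pod_id>[-a-z0-9]*)(?<docker_id>_.*)?\/(?<container_name>.*)\/.*\.log$}

m = regexp.match('/var/log/pods/tasks_mypod_00ef5a10-3110-4344-af8b-ddfbc0a00d4e/runner/0.log')

m[:namespace]      # => "tasks"
m[:pod_name]       # => "mypod"
m[:pod_id]         # => "00ef5a10-3110-4344-af8b-ddfbc0a00d4e"
m[:container_name] # => "runner"
m[:docker_id]      # => nil (rather than raising IndexError)
```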

Also, note that the Docker engine is deprecated in newer Kubernetes versions, and it looks like the /var/log/containers path is created for the Docker engine only. So no Docker engine - no /var/log/containers.

I don't know this to be relevant, other than that the metadata produced includes a 'docker' hash that contains the container ID. This should likely be renamed to something like 'container' or 'CRI'. We are only capturing the container hash, which may or may not be useful as meta.

jcantrill commented 3 years ago

I believe one fix would be to only populate the docker_id meta in the "docker" hash if it is found. Some additional documentation may be needed to define what is required in the regex.

jcantrill commented 3 years ago

@kenfdev you originally asked the question regarding a log in the following format:

/var/log/pods/c079422d-e3cb-11e7-8643-000c29573710/calico-node_0.log

which is different from @PaulAnnekov's:

/var/log/pods/tasks_mypod_00ef5a10-3110-4344-af8b-ddfbc0a00d4e/runner/0.log

Are you able to comment on the source of your sample and/or why it varies?

PaulAnnekov commented 3 years ago

Which matches the file path sample you provided and will work with the existing code with no modifications. It will provide 'nil' for the docker id in lieu of:

I checked the code and found it probably breaks the caching, so the plugin will make an API request for each log message. But if you mark the pod UUID as docker_id, that should work.

himanjanpati commented 3 years ago

@PaulAnnekov, we use AKS with Kubernetes version 1.21.2, and this is the /var/log directory structure. It does not have a containers folder or a pod log folder on the node. However, any idea where Fluentd gets the container logs and pod symlinks that are created inside the fluentd pod's /var/log folder?

Also, I am getting issues while fluentd collects logs, with lots of "//" in the log output. Can you please help with what config changes need to be made?

Thanks

jcantrill commented 2 years ago

This "sort of" works but all the meta is not returned, likely because it looks through the pod meta using container hash of which it is unaware in this format: tag_to_kubernetes_name_regexp 'var\.log\.pods\.(?<namespace>[^_]+)_(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<pod_uuid>[a-z0-9-]*)\.(?<docker_id>(?<container_name>.+))\..*\.log$' I'll look at fixing such that it searches by container name for that info

jcantrill commented 2 years ago

To clarify, based on the file paths I find on a running cluster, the correct format looks to be:

/var/log/pods/<namespace>_<podname>_<poduuid>/<containername>/0.log

jcantrill commented 2 years ago

fixed in https://rubygems.org/gems/fluent-plugin-kubernetes_metadata_filter/versions/2.9.3

zhangguanzhang commented 1 year ago

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L73
https://github.com/kubernetes/kubernetes/issues/98473