IBM / core-dump-handler

Save core dumps from a Kubernetes Service or RedHat OpenShift to an S3 protocol compatible object store
https://ibm.github.io/core-dump-handler/
MIT License
131 stars 40 forks source link

Update event handler to use temp file #137

Closed gonzalesraul closed 1 year ago

gonzalesraul commented 1 year ago

@No9 while working with a sidecar container to listen to inotify CREATE events, I found issues to read the json such as:

error processing file: /var/mnt/core-dump-handler/events/1e375e3d-867e-437b-bd57-355d326e965d-event.json: unexpected end of JSON input.

I noticed that the handler was using the final path to buffer out the content.

By using a temp file and move the final copy to the target event dir, I was able to successfully read the created files.

My thoughts on that is that we can guarantee new files to the event dir being ready to consume, as there is no other file system event that let us know when it is ready of it, the workaround for whoever needs to use the feature would be to set some delay/timer to ready

Signed-off-by: Raul Gonzales gonzalesraul03@gmail.com

No9 commented 1 year ago

In the main agent we use the inotify close event not the create event. https://github.com/IBM/core-dump-handler/blob/5ef6c8e375372f44933990214b044f59505b3b51/core-dump-agent/src/main.rs#L236

And double check it by ensuring there isn't an advisory on the file. https://github.com/IBM/core-dump-handler/blob/5ef6c8e375372f44933990214b044f59505b3b51/core-dump-agent/src/main.rs#L295

If there is a reason you can't check for the inotify close event we can take in this patch as it will improve the amount of misses but really you should consider a scheduled task that checks the advisory.

gonzalesraul commented 1 year ago

Got it, well the tool is written in go and uses https://github.com/fsnotify/fsnotify , as such, that does not support the inotify CLOSE event, only the CREATE one, by any chance can we apply that patch then ?

No9 commented 1 year ago

OK so the go kubernetes folks seems to use this iNotify https://github.com/kubernetes/utils/tree/master/inotify which seems to have support for close. Is this an option? If not I also notice this PR writes to /tmp on some public cloud k8s none secured binaries like cdc were locked down on writing to certain locations. Would it be possible to write a .tmp into the same folder so that this PR won't require additional verification.

gonzalesraul commented 1 year ago

@No9 yes It is, I am closing this PR, as you pointed I'll be working with syscall directly watching for syscall.IN_CLOSE_WRITE. Thanks for the help