driskell / log-courier

The Log Courier Suite is a set of lightweight tools created to ship and process log files speedily and securely, with low resource usage, to Elasticsearch or Logstash instances.

unable to get access or read file - notification in log-courier log #327

Closed MKuzma closed 1 year ago

MKuzma commented 8 years ago

This is not really an issue, more of an enhancement request. Once we switch the log-courier daemon user from root to log-courier, it may no longer have access to all previously harvested files. I believe the only way to notice this is that the "Started/Resuming harvester" line is missing from the log-courier log file. It might be useful in the future to log a line saying that log-courier can't access the file, if that is possible.

driskell commented 8 years ago

Supporting Glob patterns for the file paths makes this difficult, as Go's filepath.Glob ignores I/O errors rather than reporting them. However, it might be useful to at least check that the root directory is accessible, as that is likely the most common problem.

I'll implement something to locate the "stem" of the Glob so we can check permissions there.
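A minimal sketch of what such a "stem" check could look like, not the Log Courier implementation. The names `globStem` and `checkStem` are hypothetical, and the sketch assumes the stem is simply the directory portion of the pattern before the first `*`, `?` or `[` (escaped metacharacters are not handled).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// globStem returns the longest leading directory of pattern that contains
// no glob metacharacters. Hypothetical helper; escaping is not handled.
func globStem(pattern string) string {
	idx := strings.IndexAny(pattern, "*?[")
	if idx == -1 {
		// No wildcard at all: the stem is the containing directory.
		idx = len(pattern)
	}
	// Dir drops the partial path element in which the wildcard appears.
	return filepath.Dir(pattern[:idx])
}

// checkStem verifies that the stem of pattern can be opened and listed,
// which is the part filepath.Glob would otherwise fail on silently.
func checkStem(pattern string) error {
	stem := globStem(pattern)
	f, err := os.Open(stem)
	if err != nil {
		return fmt.Errorf("cannot access %q for pattern %q: %w", stem, pattern, err)
	}
	defer f.Close()
	// Reading a single entry confirms the directory is actually listable.
	// io.EOF only means the directory is empty, which is not a failure.
	if _, err := f.Readdirnames(1); err != nil && err != io.EOF {
		return fmt.Errorf("cannot list %q for pattern %q: %w", stem, pattern, err)
	}
	return nil
}
```

Calling `checkStem` once per configured path at startup would surface a permission problem on the stem directory, though it would not catch unreadable directories deeper inside the pattern.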

MKuzma commented 8 years ago

After your comment, and as I'm slowly diving into the Go language, I can see the main cause :)

driskell commented 1 year ago

I don't have bandwidth for this, as it's a fairly involved piece of work, so I will close it. But I'm happy to receive a PR.

driskell commented 1 year ago

Found a package that supports /**/ as a drop-in replacement for Glob, and it also has the ability to report IO errors. The plan is to report only a single IO error and then start ignoring them (as it aborts the scan if there is an IO error). If a configuration reload happens, the next scan will then report a single one again. I considered doing it during configuration parsing, but for a big ol' tree of logs that would slow down configuration parsing.

For the scenario noted here it should be enough, as you'd be able to check the logs and see immediately that the first scan didn't work, with just one error showing where it had issues. After that error is fixed, a reload will then alert on any further ones.
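The report-once behaviour described above could be sketched roughly as follows. This is an illustration only: the `scanner` type, its fields and the `Reload` method are hypothetical names, and `os.ReadDir` on a single directory stands in for the glob package, which is not named here.

```go
package main

import (
	"fmt"
	"os"
)

// scanner is a hypothetical stand-in for the component that periodically
// scans for files to harvest.
type scanner struct {
	dir           string           // directory to scan (stand-in for a glob pattern)
	logf          func(msg string) // where scan errors are reported
	errorReported bool             // true once an error has been logged since the last reload
}

// Scan lists s.dir. The first I/O error since the last Reload is logged;
// later errors abort the scan silently so the log is not flooded.
func (s *scanner) Scan() []string {
	entries, err := os.ReadDir(s.dir)
	if err != nil {
		if !s.errorReported {
			s.errorReported = true
			s.logf(fmt.Sprintf("scan of %q aborted: %v", s.dir, err))
		}
		return nil
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names
}

// Reload re-arms error reporting, so the next failing scan logs once again.
func (s *scanner) Reload() {
	s.errorReported = false
}
```

In this sketch only a reload re-arms the report, matching the plan above; a successful scan does not reset the flag.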