aws / amazon-cloudwatch-agent

CloudWatch Agent enables you to collect and export host-level metrics and logs on instances running Linux or Windows server.
MIT License

[Feature Request] Implement additional glob functionality for collect_list.file_path #328

Open kwkeefer opened 2 years ago

kwkeefer commented 2 years ago

In the CloudWatch Agent configuration documentation, the file_path field is described as follows:

file_path – Specifies the path of the log file to upload to CloudWatch Logs. Standard Unix glob matching rules are accepted, with the addition of ** as a super asterisk. For example, specifying /var/log/**.log causes all .log files in the /var/log directory tree to be collected. For more examples, see Glob Library.

It links to the gobwas/glob library, which describes a few glob patterns that do not appear to be implemented in the amazon-cloudwatch-agent package.

From their documentation:

  // create glob with pattern-alternatives list
  g = glob.MustCompile("{cat,bat,[fr]at}")
  g.Match("cat") // true
  g.Match("bat") // true
  g.Match("fat") // true
  g.Match("rat") // true
  g.Match("at") // false 
  g.Match("zat") // false 

It would be great to add this functionality so that these pattern-alternative lists could be used to describe file paths. I tested this and confirmed that it does not appear to work; see the example from my config file below:

...
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/var/log/pattern_test/{test1,test2}/**",
          "publish_multi_logs": true,
          "log_group_name": "pattern_test"
        },
...

I also tried:

"file_path": "{/var/log/pattern_test/test4/**,/var/log/pattern_test/test5/**}"

One use case I can think of: if you want to capture a handful of directories under a common path, this would be cleaner than adding a separate element to collect_list for each directory.
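
As a stopgap, the brace alternatives could be expanded into individual paths before they are written to collect_list. Below is a minimal Go sketch of that idea. Note that expandBraces is a hypothetical helper, not part of the agent or of gobwas/glob, and it assumes the braces are balanced and not nested.

```go
package main

import (
	"fmt"
	"strings"
)

// expandBraces is a hypothetical helper (not part of the CloudWatch agent).
// It expands the first {a,b,...} group in pattern and recurses on each
// result, so "x/{a,b}/y" becomes ["x/a/y", "x/b/y"].
// Assumption: braces are balanced and not nested.
func expandBraces(pattern string) []string {
	open := strings.Index(pattern, "{")
	if open < 0 {
		return []string{pattern}
	}
	end := strings.Index(pattern[open:], "}")
	if end < 0 {
		// Unbalanced brace: leave the pattern untouched.
		return []string{pattern}
	}
	end += open
	prefix, suffix := pattern[:open], pattern[end+1:]
	var out []string
	for _, alt := range strings.Split(pattern[open+1:end], ",") {
		out = append(out, expandBraces(prefix+alt+suffix)...)
	}
	return out
}

func main() {
	// Prints one path per alternative.
	for _, p := range expandBraces("/var/log/pattern_test/{test1,test2}/**") {
		fmt.Println(p)
	}
}
```

Each resulting path would then become its own collect_list element, which is the verbose form this request is trying to avoid writing by hand.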

jhnlsn commented 2 years ago

Hey! I dug a little more into this. We don't currently support the full pattern-alternative brackets, and they don't work quite the way you're using them either.

In your situation if you had a directory structure like this:

/var/log/pattern_test/test1/...
/var/log/pattern_test/test2/...

You could write your glob like this:

/var/log/pattern_test/test[12]/**

** isn't glob syntax; it's something we have implemented on top of glob, and as such it won't compile inside the brackets the way you're using it.

Let me know if that helps!
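
To illustrate how the character class in test[12] selects directories, here is a small sketch using path.Match from Go's standard library. It only approximates the agent's behavior: path.Match does not understand **, so the sketch uses a single * for the file name, and app.log is a made-up file name. matchesDir is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"path"
)

// matchesDir reports whether name matches pattern using the standard
// library's path.Match. It only illustrates character-class semantics;
// the agent's own matcher (and its ** extension) is not used here.
func matchesDir(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

func main() {
	// [12] matches exactly one character, either '1' or '2'.
	pattern := "/var/log/pattern_test/test[12]/*"
	for _, name := range []string{
		"/var/log/pattern_test/test1/app.log", // hypothetical file
		"/var/log/pattern_test/test2/app.log",
		"/var/log/pattern_test/test3/app.log",
	} {
		fmt.Println(name, matchesDir(pattern, name))
	}
}
```

Under these assumptions the test1 and test2 paths match and the test3 path does not. A character class only works when the directory names differ by a single character, so it does not cover every case the brace syntax would.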