open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.92k stars 2.28k forks source link

windows paths not supported by filelog receiver (include and exclude parameters) #33657

Open akumetsu183 opened 3 months ago

akumetsu183 commented 3 months ago

Component(s)

receiver/filelog

What happened?

Description

Support of Windows paths (on Windows OS) is problematic for include and exclude.

Inside include, windows \ are not working, and inside exclude, neither one works (\ or /), but it can be worked around.

INCLUDE Examples: ["C:\*.log"] WINDOWS STYLE (NOT WORKING), receiver will log errors ["C:/*.log"] LINUX STYLE, (WORKS)

Based on this, one might think that only / is supported by this receiver, on both Linux and Windows. However that is not true. When configuring exclude, even linux style / is does not work there (on WINDOWS OS). Only when using Windows style, escaped \, will exclude work.

EXCLUDE Examples: ["C:\*exclude.log"] WINDOWS STYLE, (NOT WORKING), receiver will log errors ["C:/*exclude.log"] LINUX STYLE, (NOT WORKING), receiver starts, but exclude is not respected ["C:\\*exclude.log"] WORKING WORKAROUND

From my brief investigation, the exclude issue seems to be happenning here, when comparing found vs excluded paths on windows.

It uses function doublestar.FilepathGlob to find all files, but that returns paths with slashes based on OS, which I suspect returns path with double windows slashes, and that is why the workaround with extra slash works.

Steps to Reproduce

Test the following, on a windows machine. (include and exclude both windows style \) Include and exclude are both broken in this case. (you can try the same with linux style in both parameters /, but then only the exclude will not work, unless you use the workaround mentioned above, with the doule escape)

receivers:
  filelog:
    include:
      ["C:\*.log"]
    exclude:
      ["C:\*exclude.log"]
    start_at: end
    poll_interval: 30s

service:
  pipelines:
    logs:
      receivers:
        - filelog

Expected Result

Include and exclude parameters both support the same set of / or \ or both.

I would expect one of 2 things: 1) Receiver supports linux style / only, for include and exclude. If so, the above should NOT work, but the fact that paths should be only linux style should be mentioned in readme, next to the parameters. Also, both include and exclude should work with linux style /, but it does not. 2) Or, Windows style \ is supported (on windows OS) for both include and exclude the same way, and these paths work for both. Also, readme could mention that paths need to be based on OS.

Cherry on top, receiver could detect OS and validate that it gets paths in expected format, and log if there are some detected issues (linux path used windows OS or vice versa). Or something like that.

Actual Result

In the above scenario with windows \ for both include and exclude, receiver logs errors. See below. No logs are monitored.

""" 2024-06-18T15:22:02.365Z ERROR Processing configuration record has failed {"error": "cannot unmarshal the feature configuration: yaml: line 4: did not find expected hexdecimal number", "feature-name": "filelog"} 2024-06-18T15:22:02.365Z INFO Got new configuration revision for OTEL collector {"new-config-revision": 1, "current-config-revision": 0} 2024-06-18T15:22:02.365Z INFO Running OTEL service {"config-revision": 1} 2024-06-18T15:22:02.365Z ERROR Failed to run OTEL service {"error": "failed to get config: cannot unmarshal the feature configuration: yaml: line 4: did not find expected hexdecimal number", "config-revision": 1} """ Because import on windows OS works only when path contains linux style /, and export only works with paths that have escaped windows style \\.

Collector version

v0.99.0

Environment information

Environment

OS: Windows Server 2022

OpenTelemetry Collector configuration

receivers:
  filelog:
    include:
      ["C:\Users\Administrator\Downloads\test\*.log"]
    exclude:
      ["C:\Users\Administrator\Downloads\test\*exclude.log"]
    start_at: end
    poll_interval: 30s

service:
  pipelines:
    logs:
      receivers:
        - filelog

Log output

2024-06-18T15:22:02.365Z    ERROR   Processing configuration record has failed  {"error": "cannot unmarshal the feature configuration: yaml: line 4: did not find expected hexdecimal number", "feature-name": "filelog"}
2024-06-18T15:22:02.365Z    INFO    Got new configuration revision for OTEL collector   {"new-config-revision": 1, "current-config-revision": 0}
2024-06-18T15:22:02.365Z    INFO    Running OTEL service    {"config-revision": 1}
2024-06-18T15:22:02.365Z    ERROR   Failed to run OTEL service  {"error": "failed to get config: cannot unmarshal the feature configuration: yaml: line 4: did not find expected hexdecimal number", "config-revision": 1}

Additional context

Just to make it clear. This is what you have to currently specify for include and exclude to make it all work on windows os in v99. Using linux style for include, and workaround for exclude.

receivers:
  filelog:
    include:
      ["C:/Users/Administrator/Downloads/test/*.log"]
    exclude:
      ["C:\\Users\\Administrator\\Downloads\\test\\*exclude.log"]
    start_at: end
    poll_interval: 30s

service:
  pipelines:
    logs:
      receivers:
        - filelog
github-actions[bot] commented 3 months ago

Pinging code owners:

djaglowski commented 3 months ago

I'm not sure there's anything to be done here. As you noted, using \\ in place of \ works correctly. This is because \ is a special character that must be escaped in yaml. Quoting the YAML specification: \\ is used to represent the slash.

crobert-1 commented 3 months ago

Agreed with @djaglowski. If it would be helpful we could add some documentation to this receiver to make this functionality more obvious, or a Windows-specific config example to show the required \\ format.

github-actions[bot] commented 1 month ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

pjanotti commented 1 month ago

We should document this is a general observation for Windows when specifying paths directly on the YAML file.

pjanotti commented 4 weeks ago

Actually, after re-reading this @akumetsu183 has a point: / is a valid path separator on Windows, so ["C:/*exclude.log"] for exclusion should work.

The backslash, as already pointed above needs to be escaped as explained above.

djaglowski commented 2 weeks ago

Actually, after re-reading this @akumetsu183 has a point: / is a valid path separator on Windows, so ["C:/*exclude.log"] for exclusion should work.

The backslash, as already pointed above needs to be escaped as explained above.

We're using doublestar.FilepathGlob to find files which appears to work the same regardless of path separator.

But then, we're using doublestar.PathMatch to compare against excluded files. The problem is that this function will automatically use your system's path separator.