open-telemetry / opentelemetry-collector-contrib

[receiver/hostmetrics] Process scrape integration test failing #32536

Open crobert-1 opened 2 months ago

crobert-1 commented 2 months ago

Component(s)

receiver/hostmetrics

Describe the issue you're reporting

Failing CI/CD link

Note that this test hasn't been running for some time, as explained in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32207, so this failure may not be the result of a recent change.

Failure output:

Running target 'mod-integration-test' in module 'receiver/hostmetricsreceiver' as part of group 'receiver-1'
make --no-print-directory -C receiver/hostmetricsreceiver mod-integration-test
running go integration test ./... in /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver
/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/.tools/gotestsum --rerun-fails=1 --packages="./..." -- -race -timeout 360s -parallel 4 -tags=integration,""
go: downloading golang.org/x/exp v0.0.0-20240103183307-be819d1f06fc
go: downloading github.com/klauspost/compress v1.17.2
∅  internal
∅  internal/metadata
∅  internal/perfcounters (1.012s)
✓  internal/scraper/cpuscraper (1.025s)
✓  internal/scraper/cpuscraper/internal/metadata (1.028s)
✖  . (2.35s)
✓  internal/scraper/cpuscraper/ucal (1.014s)
✓  internal/scraper/diskscraper (1.045s)
✓  internal/scraper/diskscraper/internal/metadata (1.046s)
✓  internal/scraper/filesystemscraper (1.026s)
✓  internal/scraper/filesystemscraper/internal/metadata (1.028s)
✓  internal/scraper/loadscraper (1.02s)
✓  internal/scraper/loadscraper/internal/metadata (1.051s)
✓  internal/scraper/memoryscraper (1.022s)
✓  internal/scraper/memoryscraper/internal/metadata (1.036s)
✓  internal/scraper/networkscraper (1.083s)
✓  internal/scraper/pagingscraper (1.024s)
✓  internal/scraper/networkscraper/internal/metadata (1.047s)
✓  internal/scraper/pagingscraper/internal/metadata (1.029s)
✓  internal/scraper/processesscraper (1.2s)
✓  internal/scraper/processesscraper/internal/metadata (1.032s)
✓  internal/scraper/processscraper/internal/handlecount (1.012s)
✓  internal/scraper/processscraper (1.935s)
✓  internal/scraper/processscraper/internal/metadata (1.133s)
✓  internal/scraper/processscraper/ucal (1.012s)

DONE 268 tests, 3 failures in 22.309s

✖  . (1m0.073s)
✖  . (1m0.072s)
✖  . (1m0.051s)

=== Failed
=== FAIL: . Test_ProcessScrape (0.00s)
    scraperint.go:78: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:78
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:55
            Error:          Received unexpected error:
                            host metrics scraper factory not found for key: "process"
            Test:           Test_ProcessScrape
            Messages:       failed creating metrics receiver

=== FAIL: . Test_ProcessScrapeWithCustomRootPath (0.00s)
    scraperint.go:78: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:78
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:86
            Error:          Received unexpected error:
                            host metrics scraper factory not found for key: "process"
            Test:           Test_ProcessScrapeWithCustomRootPath
            Messages:       failed creating metrics receiver

=== FAIL: . Test_ProcessScrapeWithBadRootPathAndEnvVar (0.00s)
    scraperint.go:78: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:78
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:117
            Error:          Received unexpected error:
                            host metrics scraper factory not found for key: "process"
            Test:           Test_ProcessScrapeWithBadRootPathAndEnvVar
            Messages:       failed creating metrics receiver

=== FAIL: . Test_ProcessScrape (re-run 1) (60.03s)
    scraperint.go:112: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:112
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:55
            Error:          Condition never satisfied
            Test:           Test_ProcessScrape
    scraperint.go:94: number of resources doesn't match expected: 2, actual: 1
    scraperint.go:100: full log:
        Error scraping metrics
        [previous line repeated 58 more times]
    scraperint.go:108: latest result:
        resourceMetrics:
          - resource:
              attributes:
                - key: process.pid
                  value:
                    intValue: "19632"
                - key: process.parent_pid
                  value:
                    intValue: "19624"
                - key: process.executable.name
                  value:
                    stringValue: sleep
                - key: process.executable.path
                  value:
                    stringValue: /usr/bin/sleep
                - key: process.command
                  value:
                    stringValue: /bin/sleep
                - key: process.command_line
                  value:
                    stringValue: /bin/sleep 300
                - key: process.owner
                  value:
                    stringValue: runner
            schemaUrl: https://opentelemetry.io/schemas/1.9.0
            scopeMetrics:
              - metrics:
                  - description: Total CPU seconds broken down by different states.
                    name: process.cpu.time
                    sum:
                      aggregationTemporality: 2
                      dataPoints:
                        - asDouble: 0
                          attributes:
                            - key: state
                              value:
                                stringValue: user
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                        - asDouble: 0
                          attributes:
                            - key: state
                              value:
                                stringValue: system
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                        - asDouble: 0
                          attributes:
                            - key: state
                              value:
                                stringValue: wait
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                      isMonotonic: true
                    unit: s
                  - description: Disk bytes transferred.
                    name: process.disk.io
                    sum:
                      aggregationTemporality: 2
                      dataPoints:
                        - asInt: "0"
                          attributes:
                            - key: direction
                              value:
                                stringValue: read
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                        - asInt: "0"
                          attributes:
                            - key: direction
                              value:
                                stringValue: write
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                      isMonotonic: true
                    unit: By
                  - description: The amount of physical memory in use.
                    name: process.memory.usage
                    sum:
                      aggregationTemporality: 2
                      dataPoints:
                        - asInt: "1966080"
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                    unit: By
                  - description: Virtual memory size.
                    name: process.memory.virtual
                    sum:
                      aggregationTemporality: 2
                      dataPoints:
                        - asInt: "6344704"
                          startTimeUnixNano: "1713469108000000000"
                          timeUnixNano: "1713469168180850681"
                    unit: By
                scope:
                  name: otelcol/hostmetricsreceiver/process
                  version: latest

make[2]: *** [../../Makefile.Common:142: mod-integration-test] Error 1
make[1]: *** [Makefile:165: receiver/hostmetricsreceiver] Error 2
make: *** [Makefile:122: gointegration-test] Error 2
=== FAIL: . Test_ProcessScrapeWithCustomRootPath (re-run 1) (60.02s)
    scraperint.go:112: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:112
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:86
            Error:          Condition never satisfied
            Test:           Test_ProcessScrapeWithCustomRootPath
    scraperint.go:94: number of resources doesn't match expected: 1, actual: 0
    scraperint.go:100: full log:
        Error scraping metrics
        [previous line repeated 58 more times]
    scraperint.go:108: latest result:
        {}

=== FAIL: . Test_ProcessScrapeWithBadRootPathAndEnvVar (re-run 1) (60.00s)
    scraperint.go:112: 
            Error Trace:    /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:112
                                        /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:117
            Error:          Condition never satisfied
            Test:           Test_ProcessScrapeWithBadRootPathAndEnvVar
    scraperint.go:94: number of resources doesn't match expected: 1, actual: 0
    scraperint.go:100: full log:
        Error scraping metrics
        [previous line repeated 59 more times]
    scraperint.go:108: latest result:
        {}

DONE 2 runs, 271 tests, 6 failures in 205.959s
make[1]: Leaving directory '/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib'
github-actions[bot] commented 2 months ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

crobert-1 commented 2 months ago

Another process scrape integration test failure. It's a different test but looks like the same general failure output, so I think it makes sense to keep both failures in the same issue for now.

Test_ProcessScrapeWithCustomRootPath, Test_ProcessScrapeWithBadRootPathAndEnvVar

crobert-1 commented 2 months ago

+1 freq: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8744452408/job/23997406404?pr=32529

bsponge commented 2 months ago

The problem seems to be here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b82cd7f9e3673d3850ddc5b44d4ca7968891d97c/receiver/hostmetricsreceiver/hostmetrics_receiver_test.go#L253

Printing the contents of scraperFactories (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/b82cd7f9e3673d3850ddc5b44d4ca7968891d97c/receiver/hostmetricsreceiver/factory.go#L33) while the tests run gives:

=== RUN   TestGatherMetrics_CreateMetricsScraperError
map[mock:*mock.Mock<0xc00131d090>]
--- PASS: TestGatherMetrics_CreateMetricsScraperError (0.00s)
=== RUN   Test_ProcessScrape
map[mock:*mock.Mock<0xc00131d090>]
    scraperint.go:78: 
            Error Trace:    /home/js/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:78
                                        /home/js/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/integration_test.go:55
            Error:          Received unexpected error:
                            host metrics scraper factory not found for key: "process"
            Test:           Test_ProcessScrape
            Messages:       failed creating metrics receiver

Maybe saving a copy of scraperFactories at the beginning of the test and assigning it back with defer will work. Example: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32583
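
A minimal sketch of that save-and-restore idea, not the receiver's actual code. The `scraperFactories` map here is a hypothetical stand-in holding plain strings, and `lookupFactory` and `swapFactories` are names made up for illustration. It only shows how a test that overwrites a package-level map can leak into later tests, and how restoring the map with defer avoids that:

```go
package main

import "fmt"

// scraperFactories is a hypothetical stand-in for a package-level factory
// registry. The values are plain strings to keep the example small.
var scraperFactories = map[string]string{
	"cpu":     "cpuFactory",
	"process": "processFactory",
}

// lookupFactory (illustrative name) fails when the key is not registered,
// producing an error like the one in the log above.
func lookupFactory(key string) (string, error) {
	f, ok := scraperFactories[key]
	if !ok {
		return "", fmt.Errorf("host metrics scraper factory not found for key: %q", key)
	}
	return f, nil
}

// swapFactories (illustrative name) replaces the registry and returns a
// function that puts the previous registry back.
func swapFactories(tmp map[string]string) (restore func()) {
	saved := scraperFactories
	scraperFactories = tmp
	return func() { scraperFactories = saved }
}

func main() {
	// Simulates a test that installs a mock-only registry.
	func() {
		restore := swapFactories(map[string]string{"mock": "mockFactory"})
		defer restore() // without this, the mock registry would leak out

		_, err := lookupFactory("process")
		fmt.Println("inside test:", err)
	}()

	// Simulates a later test: the original registry is visible again.
	_, err := lookupFactory("process")
	fmt.Println("after test:", err)
}
```

In a real test the restore function could also be registered with `t.Cleanup` instead of `defer`; either way the point is that the original map is reinstated before the next test runs.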

crobert-1 commented 2 months ago

My apologies, but this is still failing after rebasing on top of the fix.

It looks like the tests changed in that PR are not the ones that are failing. The PR changed TestGatherMetrics_ScraperKeyConfigError and TestGatherMetrics_CreateMetricsScraperError, but the failing tests are Test_ProcessScrape, Test_ProcessScrapeWithCustomRootPath, and Test_ProcessScrapeWithBadRootPathAndEnvVar.

songy23 commented 2 months ago

+1 freq https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/9080452658/job/24952406635?pr=33033

github-actions[bot] commented 1 day ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
