grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.

Race condition in TestPartitionReader_ConsumeAtStartup #8691

Open pracucci opened 4 months ago

pracucci commented 4 months ago

I've just seen a race condition reported by TestPartitionReader_ConsumeAtStartup (CI):

```
==================
WARNING: DATA RACE
Write at 0x00c0007f8260 by goroutine 11360:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:208 +0x589
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:183 +0x546
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1016 +0x53b
  github.com/twmb/franz-go/pkg/kgo.(*Client).close()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:1009 +0x264
  github.com/twmb/franz-go/pkg/kgo.(*Client).Close()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:993 +0xe4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stopDependencies()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:167 +0x9f
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stop()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:156 +0x1e7
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stop-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:207 +0x558
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0x33

Previous read at 0x00c0007f8260 by goroutine 12832:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1090 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1121 +0x14b3
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
            logging.go:33: level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logging.go:33: level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logging.go:33: level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logging.go:33: level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logging.go:33: level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logging.go:33: level info partition 1 msg stopping partition reader
            testing.go:1398: race detected during execution of test
FAIL
FAIL    github.com/grafana/mimir/pkg/storage/ingest 52.182s
```
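
For what it's worth, the write side here is `Client.Close()` resetting partition cursors (via `consumer.assignPartitions`) while a fetch goroutine is still reading them in `source.handleReqResp`, so the race lives in franz-go's cursor bookkeeping rather than in `PartitionReader` itself. A minimal sketch of the interleaving in plain franz-go terms (not Mimir's code; the broker address, topic, and partition are made-up placeholders):

```go
package main

import (
	"context"
	"time"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	// Consume a single partition directly, as PartitionReader does.
	client, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"), // placeholder broker
		kgo.ConsumePartitions(map[string]map[int32]kgo.Offset{
			"test": {1: kgo.NewOffset().AtStart()},
		}),
	)
	if err != nil {
		panic(err)
	}

	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), time.Second)
		defer cancel()
		// Fetch in flight: source.handleReqResp reads the partition cursor here.
		client.PollFetches(ctx)
	}()

	// Concurrent close: consumer.assignPartitions unsets the same cursor.
	client.Close()
}
```

If the two goroutines overlap, the cursor write in `Close()` races with the read in the fetch path, which is the shape of the report above.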
zenador commented 3 months ago

Another instance

narqo commented 1 month ago

Although it's not a race, I'll put it here since it comes from the same test. In this build we hit a flake in the should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset case:

FAIL: TestPartitionReader_ConsumeAtStartup

```
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (5.10s)
    logger.go:38: 2024-10-02 09:04:02.06682031 +0000 UTC m=+33.348546147 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
    logger.go:38: 2024-10-02 09:04:02.068525838 +0000 UTC m=+33.350251676 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-02 09:04:02.069234555 +0000 UTC m=+33.350960393 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
    logger.go:38: 2024-10-02 09:04:02.069778519 +0000 UTC m=+33.351504347 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:02.276416062 +0000 UTC m=+33.558141910 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:02.332598099 +0000 UTC m=+33.614323927 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:02.870250129 +0000 UTC m=+34.151975957 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:02.915102153 +0000 UTC m=+34.196827991 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:03.071172946 +0000 UTC m=+34.352898804 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:03.76303884 +0000 UTC m=+35.044764678 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:03.894623102 +0000 UTC m=+35.176348940 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:04.072419974 +0000 UTC m=+35.354145822 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:05.074261759 +0000 UTC m=+36.355987587 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:05.805081433 +0000 UTC m=+37.086807271 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:05.907466707 +0000 UTC m=+37.189192545 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:06.07564007 +0000 UTC m=+37.357365909 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    logger.go:38: 2024-10-02 09:04:07.077075325 +0000 UTC m=+38.358801163 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
    reader_test.go:1318: expected Failed, got Starting
    logger.go:38: 2024-10-02 09:04:08.078455501 +0000 UTC m=+39.360181339 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.305758644 +0000 UTC m=+39.587484482 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.305915884 +0000 UTC m=+39.587641712 level warn partition 1 msg failed to fetch the last produced offset err unable to dial: dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.408844219 +0000 UTC m=+39.690570067 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.409005267 +0000 UTC m=+39.690731125 level warn partition 1 msg failed to fetch the last produced offset err unable to dial: dial tcp 127.0.0.1:39977: connect: connection refused
FAIL
FAIL    github.com/grafana/mimir/pkg/storage/ingest 83.281s
```
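
The interesting line is `reader_test.go:1318: expected Failed, got Starting`: the reader was still retrying the last-produced-offset fetch when the assertion ran, instead of having observed the cancelled context and moved to Failed. As a rough sketch of the context-bounded retry the test expects (hypothetical names, using dskit's backoff package which Mimir vendors; this is not the actual reader code):

```go
package ingest

import (
	"context"
	"time"

	"github.com/grafana/dskit/backoff"
)

// fetchLastProducedOffset retries fetch until it succeeds or ctx is done.
// Returning boff.Err() on cancellation is what lets the service move to
// Failed promptly instead of lingering in Starting.
func fetchLastProducedOffset(ctx context.Context, fetch func(context.Context) (int64, error)) (int64, error) {
	boff := backoff.New(ctx, backoff.Config{
		MinBackoff: 100 * time.Millisecond,
		MaxBackoff: time.Second,
		MaxRetries: 0, // no retry cap: bounded only by ctx
	})
	for boff.Ongoing() {
		offset, err := fetch(ctx)
		if err == nil {
			return offset, nil
		}
		boff.Wait()
	}
	return 0, boff.Err() // context.Canceled or context.DeadlineExceeded
}
```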
codesome commented 1 month ago

Another occurrence https://github.com/grafana/mimir/actions/runs/11188014503/job/31106091993?pr=9515

FAIL: TestPartitionReader_ConsumeAtStartup

```
==================
WARNING: DATA RACE
Write at 0x00c00091f160 by goroutine 44209:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x7a8
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:259 +0x765
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1025 +0x757
  github.com/twmb/franz-go/pkg/kgo.(*Client).RemoveConsumePartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:874 +0x544
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).switchToOngoingFetcher()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:280 +0x3a4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:236 +0x3e
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:193 +0x3b7
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x33

Previous read at 0x00c00091f160 by goroutine 44587:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 44209 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1dc
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0x115
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0xb1
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).StartAsync()
      <autogenerated>:1 +0x63
  github.com/grafana/dskit/services.StartAndAwaitRunning()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/services.go:104 +0x49
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup.func13.1()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1425 +0x1018
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 44587 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
==================
WARNING: DATA RACE
Write at 0x00c00091f168 by goroutine 44625:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x133d
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1164 +0x12f0
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate.func1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1208 +0x173
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1220 +0x3a

Previous read at 0x00c00091f168 by goroutine 44587:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 44625 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1217 +0x187
  github.com/twmb/franz-go/pkg/kgo.(*Client).updateMetadataLoop()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/metadata.go:269 +0x10b4
  github.com/twmb/franz-go/pkg/kgo.NewClient.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:516 +0x33

Goroutine 44587 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
--- FAIL: TestPartitionReader_ConsumeAtStartup (0.00s)
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_consume_partition_from_last_committed_offset_if_position=last-offset,_and_wait_until_max_lag_is_honored_if_can't_honor_target_lag (6.25s)
    logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
    logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
    logging.go:33: level info component kafka_client msg initializing producer id
    logging.go:33: level info component kafka_client msg producer id initialization success id 8055687386281055342 epoch 0
    reader_test.go:1243: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    testing.go:1399: race detected during execution of test
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (9.40s)
    logger.go:38: 2024-10-04 22:49:51.676898088 +0000 UTC m=+34.847599356 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
    logger.go:38: 2024-10-04 22:49:51.678366846 +0000 UTC m=+34.849068104 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-04 22:49:51.67915597 +0000 UTC m=+34.849857238 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
    logger.go:38: 2024-10-04 22:49:51.679718225 +0000 UTC m=+34.850419483 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:51.911434318 +0000 UTC m=+35.082135576 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:51.951836444 +0000 UTC m=+35.122537712 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:52.377158206 +0000 UTC m=+35.547859474 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:52.510548242 +0000 UTC m=+35.681249510 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:52.68063619 +0000 UTC m=+35.851337458 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:53.486198117 +0000 UTC m=+36.656899375 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:53.555875287 +0000 UTC m=+36.726576555 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:53.681898234 +0000 UTC m=+36.852599492 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:54.68321097 +0000 UTC m=+37.853912238 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:55.596257648 +0000 UTC m=+38.766958917 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:55.684383801 +0000 UTC m=+38.855085070 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:55.959703512 +0000 UTC m=+39.130404781 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:56.685775563 +0000 UTC m=+39.856476832 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:57.687191198 +0000 UTC m=+40.857892467 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:58.097950501 +0000 UTC m=+41.268651769 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:58.461378582 +0000 UTC m=+41.632079850 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:58.690084891 +0000 UTC m=+41.860786159 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:49:59.691514703 +0000 UTC m=+42.862215971 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:50:00.599023338 +0000 UTC m=+43.769724607 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:50:00.60165153 +0000 UTC m=+43.772352798 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    logger.go:38: 2024-10-04 22:50:00.693131098 +0000 UTC m=+43.863832366 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:50:00.962362893 +0000 UTC m=+44.133064161 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
    logger.go:38: 2024-10-04 22:50:00.962595951 +0000 UTC m=+44.133297229 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    testing.go:1399: race detected during execution of test
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention) (0.00s)
    logger.go:38: 2024-10-04 22:49:56.198617457 +0000 UTC m=+39.369318725 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 6.633659ms start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-04 22:49:56.205584681 +0000 UTC m=+39.376285939 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_timestamp (1.03s)
    logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
    logging.go:33: level info component kafka_client msg initializing producer id
    logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
    logging.go:33: level info component kafka_client msg producer id initialization success id 7329056796108552702 epoch 0
    reader_test.go:1398: produced 2 records
    reader_test.go:1407: fetched partition end offset: 2
    reader_test.go:1415: advanced partition start offset to: 2
    logger.go:38: 2024-10-04 22:49:56.210704378 +0000 UTC m=+39.381405646 level info partition 1 msg starting consumption from timestamp timestamp 1728082196204 last_consumed_offset 1 start_offset 2 consumer_group test-group
    logger.go:38: 2024-10-04 22:49:56.212379091 +0000 UTC m=+39.383080359 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-04 22:49:56.213781242 +0000 UTC m=+39.384482510 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
    logger.go:38: 2024-10-04 22:49:57.217491014 +0000 UTC m=+40.388192292 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
    logger.go:38: 2024-10-04 22:49:57.21771803 +0000 UTC m=+40.388419298 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
    logger.go:38: 2024-10-04 22:49:57.220324061 +0000 UTC m=+40.391025329 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
    logger.go:38: 2024-10-04 22:49:57.22038782 +0000 UTC m=+40.391089088 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
    logger.go:38: 2024-10-04 22:49:57.220499199 +0000 UTC m=+40.391200457 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
    logger.go:38: 2024-10-04 22:49:57.222397652 +0000 UTC m=+40.393098940 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
    logger.go:38: 2024-10-04 22:49:57.223239291 +0000 UTC m=+40.393940559 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
    logger.go:38: 2024-10-04 22:49:57.215802466 +0000 UTC m=+40.386503734 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
    logger.go:38: 2024-10-04 22:49:57.228099361 +0000 UTC m=+40.398800630 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 10.127435ms start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-04 22:49:57.22821041 +0000 UTC m=+40.398911678 level info partition 1 msg stopping partition reader
    logger.go:38: 2024-10-04 22:49:57.228342077 +0000 UTC m=+40.399043345 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 9.77317ms start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    testing.go:1399: race detected during execution of test
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_end (5.03s)
    logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
    logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
    logging.go:33: level info component kafka_client msg initializing producer id
    logging.go:33: level info component kafka_client msg producer id initialization success id 1047195296243455181 epoch 0
    reader_test.go:1398: produced 2 records
    reader_test.go:1407: fetched partition end offset: 2
    reader_test.go:1415: advanced partition start offset to: 2
    logger.go:38: 2024-10-04 22:49:54.458762669 +0000 UTC m=+37.629463927 level info partition 1 msg starting consumption from timestamp timestamp 1728082194452 last_consumed_offset 1 start_offset 2 consumer_group test-group
    logger.go:38: 2024-10-04 22:49:54.46041991 +0000 UTC m=+37.631121168 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-04 22:49:54.462035522 +0000 UTC m=+37.632736791 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
    logger.go:38: 2024-10-04 22:49:55.464401677 +0000 UTC m=+38.635102945 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
    logger.go:38: 2024-10-04 22:49:55.464589199 +0000 UTC m=+38.635290467 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
    logger.go:38: 2024-10-04 22:49:55.465498716 +0000 UTC m=+38.636199994 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
    logger.go:38: 2024-10-04 22:49:55.464902534 +0000 UTC m=+38.635603802 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
    logger.go:38: 2024-10-04 22:49:56.46672293 +0000 UTC m=+39.637424198 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
    logger.go:38: 2024-10-04 22:49:56.466899722 +0000 UTC m=+39.637601000 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
    logger.go:38: 2024-10-04 22:49:57.46782549 +0000 UTC m=+40.638526758 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.000284732s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
    logger.go:38: 2024-10-04 22:49:57.467963869 +0000 UTC m=+40.638665137 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
    logger.go:38: 2024-10-04 22:49:58.468389136 +0000 UTC m=+41.639090404 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 3.000746787s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
    logger.go:38: 2024-10-04 22:49:58.468541993 +0000 UTC m=+41.639243281 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced
    logger.go:38: 2024-10-04 22:49:59.468667669 +0000 UTC m=+42.639368927 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.000607529s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
    logger.go:38: 2024-10-04 22:49:59.468797102 +0000 UTC m=+42.639498370 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
    logger.go:38: 2024-10-04 22:49:59.468870121 +0000 UTC m=+42.639571389 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
    logger.go:38: 2024-10-04 22:49:59.468940944 +0000 UTC m=+42.639642202 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
    logger.go:38: 2024-10-04 22:49:59.468868476 +0000 UTC m=+42.639569744 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
    logger.go:38: 2024-10-04 22:49:59.469023797 +0000 UTC m=+42.639725055 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
    logger.go:38: 2024-10-04 22:49:59.46941875 +0000 UTC m=+42.640120018 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
    logger.go:38: 2024-10-04 22:49:59.473710903 +0000 UTC m=+42.644412171 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 4.652679ms start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-04 22:49:59.473716154 +0000 UTC m=+42.644417412 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.005101365s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-04 22:49:59.473892556 +0000 UTC m=+42.644593824 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
    logger.go:38: 2024-10-04 22:49:59.473809357 +0000 UTC m=+42.644510626 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
    logger.go:38: 2024-10-04 22:49:59.476605896 +0000 UTC m=+42.647307163 level info partition 1 msg stopping partition reader
    logger.go:38: 2024-10-04 22:49:59.476730199 +0000 UTC m=+42.647431467 level warn partition 1 msg failed to fetch the last produced offset err the internal broker struct chosen to issue this request has died--either the broker id is migrating or no longer exists
    testing.go:1399: race detected during execution of test
==================
WARNING: DATA RACE
Read at 0x00c000035563 by goroutine 45240:
  testing.(*common).logDepth()
      /usr/local/go/src/testing/testing.go:1024 +0x504
  testing.(*common).log()
      /usr/local/go/src/testing/testing.go:1011 +0x7d
  testing.(*common).Log()
      /usr/local/go/src/testing/testing.go:1052 +0x55
  testing.(*T).Log()
      <autogenerated>:1 +0x4f
  github.com/grafana/mimir/pkg/util/test.(*TestingLogger).Log()
      /__w/mimir/mimir/pkg/util/test/logger.go:38 +0x177
  github.com/go-kit/log.(*context).Log()
      /__w/mimir/mimir/vendor/github.com/go-kit/log/log.go:168 +0x4ba
  github.com/grafana/dskit/spanlogger.(*SpanLogger).Log()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/spanlogger/spanlogger.go:111 +0x54
  github.com/go-kit/log.(*context).Log()
      /__w/mimir/mimir/vendor/github.com/go-kit/log/log.go:168 +0x4ba
  github.com/grafana/mimir/pkg/storage/ingest.(*fetchResult).logCompletedFetch()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:156 +0x123c
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).fetchSingle.func1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:363 +0x94
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).fetchSingle.deferwrap1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:364 +0x61
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:605 +0x5d
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).run()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:497 +0x684
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).start.gowrap2()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:568 +0x79

Previous write at 0x00c000035563 by goroutine 35663:
  testing.tRunner.func1()
      /usr/local/go/src/testing/testing.go:1677 +0x8fa
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:605 +0x5d
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 45240 (running) created at:
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).start()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:568 +0x4f0
  github.com/grafana/mimir/pkg/storage/ingest.newConcurrentFetchers.gowrap1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:293 +0x79

Goroutine 35663 (finished) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1743 +0x825
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1367 +0x627
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44
==================
FAIL
FAIL    github.com/grafana/mimir/pkg/storage/ingest 87.471s
```
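
Note the write side is different from the original report: here it is `PartitionReader.switchToOngoingFetcher()` calling `RemoveConsumePartitions()` on a live client while a fetch goroutine is mid-flight, rather than `Client.Close()`. Loosely sketched (topic `test`, partition 1, and the offsets are taken from the logs above; this is not the actual reader code):

```go
package main

import "github.com/twmb/franz-go/pkg/kgo"

// switchFetchers mirrors the remove/re-add sequence visible in the logs
// ("assigning partitions why remove of map[test:[1]] requested", then
// "immediate metadata update triggered why from AddConsumePartitions").
// Both calls mutate the partition cursor via consumer.assignPartitions,
// so they can race with a fetch still running for the same partition.
func switchFetchers(client *kgo.Client, nextOffset int64) {
	client.RemoveConsumePartitions(map[string][]int32{"test": {1}})
	client.AddConsumePartitions(map[string]map[int32]kgo.Offset{
		"test": {1: kgo.NewOffset().At(nextOffset)},
	})
}
```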
zenador commented 1 month ago

https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527


```
==================
WARNING: DATA RACE
Write at 0x00c0007927e0 by goroutine 46254:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x7a8
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:259 +0x765
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1025 +0x757
  github.com/twmb/franz-go/pkg/kgo.(*Client).RemoveConsumePartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:874 +0x544
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).switchToOngoingFetcher()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:280 +0x3a4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:236 +0x3e
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:193 +0x3b7
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x33

Previous read at 0x00c0007927e0 by goroutine 47610:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 46254 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1dc
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0x115
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0xb1
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).StartAsync()
      <autogenerated>:1 +0x63
  github.com/grafana/dskit/services.StartAndAwaitRunning()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/services.go:104 +0x49
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup.func13.1()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1425 +0x1018
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 47610 (finished) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
--- FAIL: TestPartitionReader_ConsumeAtStartup (0.00s)
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_consume_partition_from_last_committed_offset_if_position=last-offset,_and_wait_until_max_lag_is_honored_if_can't_honor_target_lag (6.10s)
    logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
    logging.go:33: level info component kafka_client msg initializing producer id
    logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
    logging.go:33: level info component kafka_client msg producer id initialization success id 4006748771009727914 epoch 0
    reader_test.go:1243: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1258: produced 1 record
    reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
    reader_test.go:1258: produced 1 record
    testing.go:1399: race detected during execution of test
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (8.93s)
    logger.go:38: 2024-10-10 04:16:53.606977776 +0000 UTC m=+34.053737144 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
    logger.go:38: 2024-10-10 04:16:53.60852153 +0000 UTC m=+34.055280888 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-10 04:16:53.60929543 +0000 UTC m=+34.056054798 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
    logger.go:38: 2024-10-10 04:16:53.609802089 +0000 UTC m=+34.056561467 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err read tcp 127.0.0.1:58428->127.0.0.1:38907: read: connection reset by peer
    logger.go:38: 2024-10-10 04:16:53.855954002 +0000 UTC m=+34.302713360 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:53.886681183 +0000 UTC m=+34.333440551 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:54.292218019 +0000 UTC m=+34.738977377 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:54.421562839 +0000 UTC m=+34.868322227 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:54.610747307 +0000 UTC m=+35.057506675 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:55.181056708 +0000 UTC m=+35.627816076 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:55.36460113 +0000 UTC m=+35.811360488 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:55.611916391 +0000 UTC m=+36.058675759 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:56.613682244 +0000 UTC m=+37.060441612 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:57.094000418 +0000 UTC m=+37.540759776 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:57.396594373 +0000 UTC m=+37.843353741 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:57.615222887 +0000 UTC m=+38.061982255 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:58.616429473 +0000 UTC m=+39.063188831 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:59.595831082 +0000 UTC m=+40.042590450 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:59.61804281 +0000 UTC m=+40.064802189 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:16:59.897691713 +0000 UTC m=+40.344451071 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:17:00.619479618 +0000 UTC m=+41.066238986 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:17:01.621108936 +0000 UTC m=+42.067868323 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:17:02.097145895 +0000 UTC m=+42.543905253 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:17:02.097325772 +0000 UTC m=+42.544085140 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    logger.go:38: 2024-10-10 04:17:02.399074166 +0000 UTC m=+42.845833524 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
    logger.go:38: 2024-10-10 04:17:02.399246449 +0000 UTC m=+42.846005827 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    testing.go:1399: race detected during execution of test
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention) (0.00s)
--- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_timestamp (4.02s)
    logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
    logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
    logging.go:33: level info component kafka_client msg initializing producer id
    logging.go:33: level info component kafka_client msg producer id initialization success id 3033693887431518137 epoch 0
    reader_test.go:1398: produced 2 records
    reader_test.go:1407: fetched partition end offset: 2
    reader_test.go:1415: advanced partition start offset to: 2
    logger.go:38: 2024-10-10 04:16:55.275439755 +0000 UTC m=+35.722199124 level info partition 1 msg starting consumption from timestamp timestamp 1728533815272 last_consumed_offset 1 start_offset 2 consumer_group test-group
    logger.go:38: 2024-10-10 04:16:55.279723549 +0000 UTC m=+35.726482917 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
    logger.go:38: 2024-10-10 04:16:55.280457384 +0000 UTC m=+35.727216752 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
    logger.go:38: 2024-10-10 04:16:56.282164427 +0000 UTC m=+36.728923795 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
    logger.go:38: 2024-10-10 04:16:56.2823 +0000 UTC m=+36.729059369 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
    logger.go:38: 2024-10-10 04:16:56.283082546 +0000 UTC m=+36.729841904 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
    logger.go:38: 2024-10-10 04:16:56.28320188 +0000 UTC m=+36.729961278 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
    logger.go:38: 2024-10-10 04:16:57.283788296 +0000 UTC m=+37.730547664 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 1.000489484s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
    logger.go:38: 2024-10-10 04:16:57.283872724 +0000 UTC m=+37.730632092 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
    logger.go:38: 2024-10-10 04:16:58.284821546 +0000 UTC m=+38.731580924 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.001451842s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
    logger.go:38: 2024-10-10 04:16:58.284972299 +0000 UTC m=+38.731731667 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced
    logger.go:38: 2024-10-10 04:16:59.285415824 +0000 UTC m=+39.732175182 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
    logger.go:38: 2024-10-10 04:16:59.28550985 +0000 UTC m=+39.732269208 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
    logger.go:38: 2024-10-10 04:16:59.285598536 +0000 UTC m=+39.732357914 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
    logger.go:38: 2024-10-10 04:16:59.28542915 +0000 UTC m=+39.732188518 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
    logger.go:38: 2024-10-10 04:16:59.287273636 +0000 UTC m=+39.734033003 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
    logger.go:38: 2024-10-10 04:16:59.288731507 +0000 UTC m=+39.735490885 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.003667245s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-10 04:16:59.288827636 +0000 UTC m=+39.735587005 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
    logger.go:38: 2024-10-10 04:16:59.28894696 +0000 UTC m=+39.735706318 level info partition 1 msg stopping partition reader
    logger.go:38: 2024-10-10 04:16:59.289101199 +0000 UTC m=+39.735860567 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 2.005147243s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err
    logger.go:38: 2024-10-10 04:16:59.289181058 +0000 UTC m=+39.735940427 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
    logger.go:38: 2024-10-10 04:16:59.289522999
```
+0000 UTC m=+39.736282397 level warn partition 1 msg failed to fetch the last produced offset err context canceled testing.go:1399: race detected during execution of test --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_end (4.02s) logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test logging.go:33: level info component kafka_client msg initializing producer id logging.go:33: level info component kafka_client msg producer id initialization success id 8262787052353573940 epoch 0 reader_test.go:1398: produced 2 records reader_test.go:1407: fetched partition end offset: 2 reader_test.go:1415: advanced partition start offset to: 2 logger.go:38: 2024-10-10 04:16:57.297[80](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:81)7347 +0000 UTC m=+37.744566716 level info partition 1 msg starting consumption from timestamp timestamp 1728533[81](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:82)7294 last_consumed_offset 1 start_offset 2 consumer_group test-group logger.go:38: 2024-10-10 04:16:57.29944261 +0000 UTC m=+37.746201979 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization logger.go:38: 2024-10-10 04:16:57.30111883 +0000 UTC m=+37.74787[82](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:83)09 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}] logger.go:38: 2024-10-10 04:16:58.303079619 +0000 UTC m=+38.749[83](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:84)8987 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2 logger.go:38: 2024-10-10 04:16:58.303231764 +0000 UTC m=+38.749991132 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]} logger.go:38: 2024-10-10 04:16:58.304303873 +0000 UTC m=+38.751063231 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored logger.go:38: 2024-10-10 04:16:58.304415171 +0000 UTC m=+38.751174539 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2 logger.go:38: 2024-10-10 04:16:59.305628259 +0000 UTC m=+39.752387617 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 1.000967788s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 
2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server. logger.go:38: 2024-10-10 04:16:59.305724449 +0000 UTC m=+39.752483817 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced logger.go:38: 2024-10-10 04:17:00.3065152[86](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:87) +0000 UTC m=+40.753274654 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.001992864s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server. logger.go:38: 2024-10-10 04:17:00.306617558 +0000 UTC m=+40.753376936 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced logger.go:38: 2024-10-10 04:17:01.307463277 +0000 UTC m=+41.754222655 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2 logger.go:38: 2024-10-10 04:17:01.307620412 +0000 UTC m=+41.754379780 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]} logger.go:38: 2024-10-10 04:17:01.308063101 +0000 UTC m=+41.754822469 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1 logger.go:38: 2024-10-10 04:17:01.308121991 +0000 UTC m=+41.754881360 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s logger.go:38: 2024-10-10 04:17:01.308204095 +0000 UTC m=+41.754963463 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}] logger.go:38: 2024-10-10 04:17:01.31006444 +0000 UTC m=+41.756823808 level info partition 1 msg stopping partition reader logger.go:38: 2024-10-10 04:17:01.311[87](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:88)7194 +0000 UTC m=+41.758636563 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 2.006076292s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err logger.go:38: 2024-10-10 04:17:01.311967163 +0000 UTC m=+41.758726551 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result logger.go:38: 2024-10-10 04:17:01.312043387 +0000 UTC m=+41.75[88](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:89)02755 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.005334679s start_offset 2 end_offset 4 
asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp last_timestamp hwm 0 lso 0 err logger.go:38: 2024-10-10 04:17:01.312132574 +0000 UTC m=+41.758[89](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:90)1942 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result logger.go:38: 2024-10-10 04:17:01.312423439 +0000 UTC m=+41.75[91](https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527#step:8:92)82797 level warn partition 1 msg failed to fetch the last produced offset err context canceled testing.go:1399: race detected during execution of test FAIL ```

dimitarvdimitrov commented 1 month ago

These latest occurrences include some races inside franz-go itself and some races in our use of `testing.T.Log`. I opened a PR to fix the latter: https://github.com/grafana/mimir/pull/9579
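
For context, the `testing.T.Log` race shows up when a background goroutine (for example, the Kafka client shutting down) keeps logging after the test function has returned; calling `t.Log` at that point races with the test framework's own state. A minimal sketch of one common mitigation, assuming a hypothetical `testLogger` wrapper (the names here are illustrative, not necessarily what the PR does):

```go
package ingest_test

import (
	"sync"
	"testing"
)

// testLogger is a hypothetical helper that makes t.Log safe to call from
// goroutines that may outlive the test body.
type testLogger struct {
	mtx      sync.Mutex
	t        *testing.T
	finished bool
}

func newTestLogger(t *testing.T) *testLogger {
	l := &testLogger{t: t}
	// Flip the flag before the test is marked complete, so any goroutine
	// still running (e.g. the Kafka client) stops calling t.Log.
	t.Cleanup(func() {
		l.mtx.Lock()
		defer l.mtx.Unlock()
		l.finished = true
	})
	return l
}

func (l *testLogger) Log(args ...any) {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	if l.finished {
		return // Drop log lines emitted after the test has finished.
	}
	l.t.Log(args...)
}
```

A wrapper like this drops late log lines instead of racing on the testing internals; the races inside franz-go itself would still need to be addressed separately.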