thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Swift upload: "new bucket block: list chunk files: Cannot extract names from response with content-type: []" #713

Closed tivvit closed 4 years ago

tivvit commented 5 years ago

Hi

@sudhi-vm

Thanos, Prometheus and Golang version used:

thanos, version 0.2.1 (branch: HEAD, revision: 30e7cbdafd3ef01189f202945c2728fcf37e1cf1)
  build user: root@79ffcf51ff9b
  build date: 20181227-15:44:56
  go version: go1.11.4

What happened: I am having a problem with the Swift store API.

I can see it both in the sidecar and in store

What you expected to happen: The sidecar should be able to close files.

The store does not even start

How to reproduce it (as minimally and precisely as possible): Start the sidecar with Swift configured as the object store.

Full logs to relevant components: Here is the log from the sidecar

level=info ts=2019-01-07T13:00:32.764722917Z caller=shipper.go:201 msg="upload new block" id=01D0M6C05AQ76KYERF02E7JRNY
level=warn ts=2019-01-07T13:00:32.827598341Z caller=runutil.go:69 msg="detected close error" err="close file /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/meta.json: close /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/meta.json: file already closed"
level=warn ts=2019-01-07T13:00:36.581858197Z caller=runutil.go:69 msg="detected close error" err="close file /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/chunks/000001: close /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/chunks/000001: file already closed"
level=warn ts=2019-01-07T13:00:37.212250194Z caller=runutil.go:69 msg="detected close error" err="close file /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/index: close /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/index: file already closed"
level=warn ts=2019-01-07T13:00:37.28117964Z caller=runutil.go:69 msg="detected close error" err="close file /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/meta.json: close /data/thanos/upload/01D0M6C05AQ76KYERF02E7JRNY/meta.json: file already closed"

and from store

level=info ts=2019-01-07T13:08:19.353763442Z caller=factory.go:39 msg="loading bucket configuration"
level=warn ts=2019-01-07T13:08:20.34939541Z caller=bucket.go:240 msg="loading block failed" id=01D0KRMHPN540PRCCYY7JVDS03 err="new bucket block: list chunk files: Cannot extract names from response with content-type: []"
level=warn ts=2019-01-07T13:08:20.479011839Z caller=bucket.go:240 msg="loading block failed" id=01D0KZG8Y99KDN9AWZ61QH284Z err="new bucket block: list chunk files: Cannot extract names from response with content-type: []"
level=warn ts=2019-01-07T13:08:20.563062407Z caller=bucket.go:240 msg="loading block failed" id=01D0M6C05AQ76KYERF02E7JRNY err="new bucket block: list chunk files: Cannot extract names from response with content-type: []"
store command failed: bucket store initial sync: sync block: iter: Cannot extract names from response with content-type: []

and files on SWIFT

 94M 2019-01-07 12:52:57 application/octet-stream 01D0KRMHPN540PRCCYY7JVDS03/chunks/000001
9.9M 2019-01-07 12:52:59 application/octet-stream 01D0KRMHPN540PRCCYY7JVDS03/index
 416 2019-01-07 12:53:00         application/json 01D0KRMHPN540PRCCYY7JVDS03/meta.json
 96M 2019-01-07 12:53:01 application/octet-stream 01D0KZG8Y99KDN9AWZ61QH284Z/chunks/000001
9.9M 2019-01-07 12:53:04 application/octet-stream 01D0KZG8Y99KDN9AWZ61QH284Z/index
 416 2019-01-07 12:53:04         application/json 01D0KZG8Y99KDN9AWZ61QH284Z/meta.json
 88M 2019-01-07 13:00:34 application/octet-stream 01D0M6C05AQ76KYERF02E7JRNY/chunks/000001
9.9M 2019-01-07 13:00:36 application/octet-stream 01D0M6C05AQ76KYERF02E7JRNY/index
 416 2019-01-07 13:00:37         application/json 01D0M6C05AQ76KYERF02E7JRNY/meta.json
 416 2019-01-07 12:52:55         application/json debug/metas/01D0KRMHPN540PRCCYY7JVDS03.json
 416 2019-01-07 12:53:00         application/json debug/metas/01D0KZG8Y99KDN9AWZ61QH284Z.json
 416 2019-01-07 13:00:32         application/json debug/metas/01D0M6C05AQ76KYERF02E7JRNY.json

bwplotka commented 5 years ago

CC Swift client owner: @sudhi-vm

I think there are two issues here. The close error is not blocking anything, I think (right?), but the error in the store looks like it is blocking block loading.

tivvit commented 5 years ago

Yes, the close issue looks like a double close, and the file looks like it was uploaded. The close error is not blocking: the sidecar is still working.

The store does not start because the block load fails. But I am really not sure whether the file was somehow corrupted by the sidecar or whether the problem is something else.

pennpeng commented 5 years ago

@tivvit do you have any docs or a blog post on configuring Thanos to use OpenStack Swift? Please help me.

tivvit commented 5 years ago

@pennpeng I will publish the configuration. It looks valid and files are being uploaded, but it cannot be used because of the bug described in this issue.

bwplotka commented 5 years ago

@tivvit do you have time to give us a hand and run go test -v ./pkg/objstore/objtesting/... with export THANOS_SKIP_S3_AWS_TESTS='true' && export THANOS_SKIP_AZURE_TESTS='true' && export THANOS_SKIP_GCS_TESTS='true' && export THANOS_SKIP_TENCENT_COS_TESTS='true' and Swift credentials configured?

This checks the basic behaviour of the client. Something simple might be broken, and this acceptance test might reproduce it.

FUSAKLA commented 5 years ago

@tivvit let me know if you need some help with this. Or, if you struggle to find the time, pass me the credentials and config internally and I can look into this further. (We happen to be from the same company.)

sudhi-vm commented 5 years ago

@bwplotka, @tivvit Just saw this, I'll take a look at it.

FlorinPeter commented 5 years ago

I ran into the same issue using v0.6.1.

level=info ts=2019-09-02T14:23:55.832742858Z caller=main.go:154 msg="Tracing will be disabled"
level=info ts=2019-09-02T14:23:55.833221338Z caller=main.go:274 component=sidecar msg="disabled TLS, key and cert must be set to enable"
level=info ts=2019-09-02T14:23:55.833250064Z caller=factory.go:39 msg="loading bucket configuration"
level=info ts=2019-09-02T14:23:55.966894111Z caller=sidecar.go:289 msg="starting sidecar"
level=info ts=2019-09-02T14:23:55.967453772Z caller=reloader.go:154 component=reloader msg="started watching config file and non-recursively rule dirs for changes" cfg= out= dirs=
level=info ts=2019-09-02T14:23:55.968113018Z caller=main.go:326 msg="Listening for metrics" address=[10.128.2.45]:10902
level=info ts=2019-09-02T14:23:55.968210207Z caller=sidecar.go:222 component=sidecar msg="Listening for StoreAPI gRPC" address=[10.128.2.45]:10901
level=warn ts=2019-09-02T14:23:55.97165183Z caller=sidecar.go:301 msg="failed to get Prometheus flags. Is Prometheus running? Retrying" err="request config against http://localhost:9090/api/v1/status/flags: Get http://localhost:9090/api/v1/status/flags: dial tcp [::1]:9090: connect: connection refused"
level=info ts=2019-09-02T14:23:57.973213786Z caller=sidecar.go:154 msg="successfully loaded prometheus external labels" external_labels="{prometheus=\"mcs-monitoring/k8s\",prometheus_replica=\"prometheus-k8s-0\"}"
level=info ts=2019-09-02T17:24:30.016366224Z caller=shipper.go:349 msg="upload new block" id=01DKSG37DTFBQ2YFYP44MKJC60
level=warn ts=2019-09-02T17:24:30.087598936Z caller=runutil.go:108 msg="detected close error" err="close file /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/meta.json: close /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/meta.json: file already closed"
level=warn ts=2019-09-02T17:24:30.139468599Z caller=runutil.go:108 msg="detected close error" err="close file /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/chunks/000001: close /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/chunks/000001: file already closed"
level=warn ts=2019-09-02T17:24:30.187988163Z caller=runutil.go:108 msg="detected close error" err="close file /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/index: close /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/index: file already closed"
level=warn ts=2019-09-02T17:24:30.227617691Z caller=runutil.go:108 msg="detected close error" err="close file /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/meta.json: close /prometheus/thanos/upload/01DKSG37DTFBQ2YFYP44MKJC60/meta.json: file already closed"

I also ran go test -v ./pkg/objstore/objtesting with the same credentials as the sidecar.

=== RUN   TestObjStore_AcceptanceTest_e2e
=== RUN   TestObjStore_AcceptanceTest_e2e/inmem
=== RUN   TestObjStore_AcceptanceTest_e2e/swift
--- PASS: TestObjStore_AcceptanceTest_e2e (8.53s)
    --- PASS: TestObjStore_AcceptanceTest_e2e/inmem (0.00s)
    foreach.go:48: THANOS_SKIP_GCS_TESTS envvar present. Skipping test against GCS.
    foreach.go:70: THANOS_SKIP_S3_AWS_TESTS envvar present. Skipping test against S3 AWS.
    foreach.go:86: THANOS_SKIP_AZURE_TESTS envvar present. Skipping test against Azure.
    swift.go:304: created temporary container for swift tests with name test_testobjstore_acceptancetest_e2e_5b71dc0538454db6
    --- PASS: TestObjStore_AcceptanceTest_e2e/swift (5.89s)
    foreach.go:118: THANOS_SKIP_TENCENT_COS_TESTS envvar present. Skipping test against Tencent COS.
PASS
ok      github.com/thanos-io/thanos/pkg/objstore/objtesting 8.564s

Any idea where to start debugging?

FlorinPeter commented 5 years ago

After some local tests I found the issue and opened PR #1489.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

FUSAKLA commented 4 years ago

Still valid. I just tried to migrate to Swift from S3 and it failed with the same error message.

Hopefully this will be fixed by https://github.com/thanos-io/thanos/pull/2665

stale[bot] commented 4 years ago

Hello 👋 Looks like there was no activity on this issue for last 30 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.