yandex-cloud / k8s-csi-s3

GeeseFS-based CSI for mounting S3 buckets as PersistentVolumes

Cannot create S3 volume on ceph, having "The provided 'x-amz-content-sha256' header does not match what was computed." error #139

Open zentavr opened 2 months ago

zentavr commented 2 months ago

I created the PVC like this:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zabbix-db-dump
  annotations:
    helm.sh/resource-policy: keep
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 750Gi
  storageClassName: csi-s3

The CSI driver is the latest version, 0.41.1, installed with these values:

---
storageClass:
  # Specifies whether the storage class should be created
  create: true
  singleBucket: "cti-k8s-csi"

secret:
  # Specifies whether the secret should be created
  create: true
  # Name of the secret
  name: "csi-s3-secret"
  # S3 Access Key
  accessKey: "*******redacted*******"
  # S3 Secret Key
  secretKey: "*******redacted*******"
  # Endpoint
  endpoint: "http://rgw-slow-dev01.ti.local:80"

Other volumes from the old deployments work fine. What could be wrong here?

Events:

Reason: ExternalProvisioning (source: persistentvolume-controller)

Waiting for a volume to be created either by the external provisioner 'ru.yandex.s3.csi' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

Reason: ProvisioningFailed (source: ru.yandex.s3.csi_csi-s3-provisioner-0_4d383a87-31f4-42c5-b9c9-58d2c6c41063)

failed to provision volume with StorageClass "csi-s3": rpc error: code = Unknown desc = failed to create prefix pvc-a555057d-22e8-489d-bc7c-86e678f2d83f: The provided 'x-amz-content-sha256' header does not match what was computed.

zentavr commented 2 months ago

Listing the bucket with s3cmd shows:

s3cmd --acl-private --access_key=*******redacted******* --secret_key=*******redacted******* --host=127.0.0.1 --host-bucket=127.0.0.1 --ssl --no-check-certificate ls s3://cti-k8s-csi
                       DIR   s3://cti-k8s-csi/pvc-cbd4812e-f26f-4d62-b591-65bb41f42a00/
                       DIR   s3://cti-k8s-csi/pvc-cf7ff348-77c6-4f9e-a3e1-54f00d645d7f/
                       DIR   s3://cti-k8s-csi/pvc-ec4d9b57-c291-447d-8e92-e7825b9ac77c/

...so it shows the pvc- folders for the old PVCs.

zentavr commented 2 months ago

We upgraded the Ceph cluster from 17.2.7 to 18.2.4, and the issue is still there.

zentavr commented 2 months ago

I was able to capture the HTTP requests; they are here: https://gist.github.com/zentavr/facecab1db8376cb9fc011e1e955e68d

zentavr commented 2 months ago

Probably related to: https://tracker.ceph.com/issues/63153

I noticed in my sniffed data that when there is a header like:

X-Amz-Content-Sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD

...for example, in this full request:

PUT /cti-k8s-csi/pvc-8c2c2930-681a-4ad1-9f35-64b4ecdba2e0/ HTTP/1.1
Host: rgw-slow-dev01.ti.local
User-Agent: MinIO (linux; amd64) minio-go/v7.0.5
Transfer-Encoding: chunked
Authorization: AWS4-HMAC-SHA256 Credential=1Z8NJSBMVHQK32YFVK04/20240821/default/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length,Signature=dc4974e5608a1bc6b777fac789dd974537eb54e07796827aec4c3a997883727e
Content-Type: application/octet-stream
X-Amz-Content-Sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD
X-Amz-Date: 20240821T232356Z
X-Amz-Decoded-Content-Length: 0
Accept-Encoding: gzip

...then the issue happens and I get an HTTP/1.1 400 Bad Request reply.
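To check other captured requests for the same pattern, a small script along these lines may help. It is only a sketch: the helper names `parse_headers` and `is_empty_streaming_put` are made up for illustration, and it assumes that the failing combination is a streaming-signed request whose decoded length is 0, which is what the capture above suggests but has not been confirmed as the cause. For comparison, `EMPTY_SHA256` is the value a non-streaming SigV4 signer would put in `X-Amz-Content-Sha256` for an empty body.

```python
import hashlib

# SHA-256 of an empty payload: what a non-streaming signer sends for a 0-byte body.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()
STREAMING = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD"


def parse_headers(raw_request: str) -> dict:
    """Map lower-cased header names to values, skipping the request line."""
    headers = {}
    for line in raw_request.strip().splitlines()[1:]:
        if not line.strip():
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def is_empty_streaming_put(raw_request: str) -> bool:
    """True if the request is streaming-signed but declares a 0-byte payload."""
    headers = parse_headers(raw_request)
    return (
        headers.get("x-amz-content-sha256") == STREAMING
        and headers.get("x-amz-decoded-content-length") == "0"
    )
```

Feeding it the text of each request from the gist would show whether every 400 reply lines up with this header combination.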

zentavr commented 2 months ago

The workaround for now is to create the prefix folder manually, then manually patch the PVC and manually create the volume as well.
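One possible way to do the first step (creating the prefix folder) without a streaming signature is to sign the empty PUT with the plain payload hash. The sketch below builds SigV4 headers using only the Python standard library. The function name `sign_empty_put` and the placeholder credentials are hypothetical, the region `default` is taken from the credential scope in the captured request, and whether RGW accepts this request in this particular setup is an assumption, not something verified here.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def sign_empty_put(host, path, access_key, secret_key, amz_date, region="default"):
    """Build SigV4 headers for a 0-byte PUT signed with the real payload hash."""
    payload_hash = hashlib.sha256(b"").hexdigest()  # not the STREAMING placeholder
    date = amz_date[:8]
    scope = f"{date}/{region}/s3/aws4_request"
    headers = {
        "host": host,
        "x-amz-content-sha256": payload_hash,
        "x-amz-date": amz_date,
    }
    signed = ";".join(sorted(headers))
    canonical_headers = "".join(f"{k}:{headers[k]}\n" for k in sorted(headers))
    # Method, URI, empty query string, headers, signed header list, payload hash.
    canonical_request = "\n".join(
        ["PUT", quote(path, safe="/-_.~"), "", canonical_headers, signed, payload_hash]
    )
    string_to_sign = "\n".join(
        [
            "AWS4-HMAC-SHA256",
            amz_date,
            scope,
            hashlib.sha256(canonical_request.encode()).hexdigest(),
        ]
    )
    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode(), date)
    for part in (region, "s3", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()
    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope},"
        f"SignedHeaders={signed},Signature={signature}"
    )
    return headers
```

The returned headers could then be sent with `http.client.HTTPConnection(host).request("PUT", path, body=b"", headers=headers)`, using a fresh `amz_date` in `YYYYMMDDTHHMMSSZ` form (UTC) and a path such as `/cti-k8s-csi/<pvc-name>/`. Patching the PVC and creating the volume still have to be done separately.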