airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[helm] MinIO statefulset not working #36801

Open luis-fnogueira opened 6 months ago

luis-fnogueira commented 6 months ago

Helm Chart Version

0.63.10

What step the error happened?

On deploy

Relevant information

During deployment of Airbyte, the MinIO statefulset doesn't work properly: it fails with an access error and the pod keeps restarting. The pod's logs are below. Maybe the values.yaml file lacks MinIO configuration?

Relevant log output

API: SYSTEM()
Time: 16:40:26 UTC 04/03/2024
Error: unable to rename (/storage/.minio.sys/tmp -> /storage/.minio.sys/tmp-old/ed17c586-f81b-4574-8b2b-5311a854498e) file access denied, drive may be faulty please investigate (*fmt.wrapError)
       6: internal/logger/logger.go:258:logger.LogIf()
       5: cmd/prepare-storage.go:89:cmd.bgFormatErasureCleanupTmp()
       4: cmd/xl-storage.go:263:cmd.newXLStorage()
       3: cmd/object-api-common.go:63:cmd.newStorageAPI()
       2: cmd/format-erasure.go:673:cmd.initStorageDisksWithErrors.func1()
       1: github.com/minio/pkg/v2@v2.0.3-0.20231107172951-8a60b89ec9b4/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()

API: SYSTEM()
Time: 16:40:26 UTC 04/03/2024
Error: unable to create (/storage/.minio.sys/tmp) file access denied, drive may be faulty please investigate (*fmt.wrapError)
       6: internal/logger/logger.go:258:logger.LogIf()
       5: cmd/prepare-storage.go:96:cmd.bgFormatErasureCleanupTmp()
       4: cmd/xl-storage.go:263:cmd.newXLStorage()
       3: cmd/object-api-common.go:63:cmd.newStorageAPI()
       2: cmd/format-erasure.go:673:cmd.initStorageDisksWithErrors.func1()
       1: github.com/minio/pkg/v2@v2.0.3-0.20231107172951-8a60b89ec9b4/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
ERROR Unable to use the drive /storage: file access denied: Invalid arguments specified

luis-fnogueira commented 6 months ago

I tried to connect to an external DB, and the MinIO failure is breaking the connection: Internal Server Error: Unable to execute HTTP request: Connect to airbyte-minio-svc:9000 [airbyte-minio-svc/172.20.93.157] failed: Connection refused

luis-fnogueira commented 6 months ago

I changed the securityContext in the templates of Airbyte's chart to:

      securityContext:
        allowPrivilegeEscalation: true 
        runAsNonRoot: false
        # uid=1000(airbyte)
        runAsUser: 0
        # gid=1000(airbyte)
        runAsGroup: 0
        readOnlyRootFilesystem: false
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault

But I don't know how unsafe this is.
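
Running the container as root with privilege escalation allowed gets past the "file access denied" error, but it also widens what a compromised container could do. A less permissive sketch follows, assuming the only real problem is that the non-root user cannot write to /storage. It keeps the uid/gid 1000 noted in the comments above and adds a pod-level fsGroup so that Kubernetes makes the mounted volume accessible to that group. The container name is hypothetical. Whether the chart exposes these fields, and whether the storage driver honors fsGroup, are assumptions to verify.

```yaml
# Pod-level settings (spec.template.spec of the MinIO StatefulSet).
securityContext:
  # Assumption: 1000 matches the airbyte gid noted in the comments above.
  fsGroup: 1000
  # Only change volume ownership when the volume root does not already match.
  fsGroupChangePolicy: "OnRootMismatch"
containers:
  - name: airbyte-minio            # hypothetical container name
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 1000
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
```

If the volume type does not support ownership changes through fsGroup, this will not help and the existing data would still need to be re-owned or recreated.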

Ahsalis commented 5 months ago

@luis-fnogueira Did the minio pod start working with these permissions? I'm having the same issue you described with the helm chart version 0.63.33.

Ahsalis commented 5 months ago

This issue is fixed in helm chart version 0.64.52.

ollie-nye commented 5 months ago

We're still seeing the same issue on 0.64.52: the minio pod is crash-looping with ERROR Unable to use the drive /storage: drive access denied: Invalid arguments specified

ollie-nye commented 5 months ago

Traced it back to https://github.com/airbytehq/airbyte-platform/commit/ac2c24ad23a28e0891499edcf0027d8ab7f88e1c, which changed the default UID and GID of the running pod, so it can no longer read files it used to have access to.
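
If the cause is that files on the existing volume are still owned by the old UID/GID, one common Kubernetes pattern is a one-off init container that re-owns the data before MinIO starts. This is a sketch under assumptions, not something the chart is known to provide. The init container name and the volume name are hypothetical, and 1000:1000 should be replaced with whatever UID/GID the pod actually runs as after the change.

```yaml
# Added under spec.template.spec of the MinIO StatefulSet.
initContainers:
  - name: fix-storage-ownership          # hypothetical name
    image: busybox:1.36
    # Recursively hand the data directory to the UID/GID the main container uses.
    # Assumption: 1000:1000; adjust to the pod's actual runAsUser/runAsGroup.
    command: ["sh", "-c", "chown -R 1000:1000 /storage"]
    securityContext:
      # Root is needed only for this short-lived step, not for MinIO itself.
      runAsUser: 0
      runAsNonRoot: false
    volumeMounts:
      - name: minio-storage              # hypothetical; must match the StatefulSet's volume name
        mountPath: /storage
```

Unlike deleting the volume, this keeps the existing data, at the cost of running one short-lived container as root.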

rensoostenbachBL commented 4 months ago

Will this be fixed? I am running into the same issue on Helm chart version 0.87.4. Downgrading Airbyte is no longer an option because a schema change has already been applied that is incompatible with lower versions.

p-vrachnis commented 4 months ago

Hi, deleting the pvc and the pod should make it work.

rensoostenbachBL commented 4 months ago

Can confirm @p-vrachnis, appreciate the help!

knuurr commented 5 days ago

> Hi, deleting the pvc and the pod should make it work.

Do you know what is contained within that pvc? I mean, what kind of data does Airbyte store here and is it safe to just delete it and recreate?

I inherited our deployment from a previous person. This is what I see in my values.yml regarding MinIO:

          values: |
            global:
              state:
                storage:
                  type: "MINIO"
              logs:
                storage:
                  type: "MINIO"
                minio:
                  enabled: true

As I understand it, this should only be logs, so nothing critical (like created workflows, for example). But I'd rather ask.

p-vrachnis commented 5 days ago

It should be only logs. We did not notice any other data missing, in our case at least.