datashim-io / datashim

A kubernetes based framework for hassle free handling of datasets
http://datashim-io.github.io/datashim
Apache License 2.0

Bug: pod for downloading archive datasets does not support arm64 architecture #367

Closed AlessandroPomponio closed 3 months ago

AlessandroPomponio commented 4 months ago

What happened: After creating a dataset of type ARCHIVE, the pod that mounts the dataset shows no content. This is because the download pod fails with:

exec /bin/sh: exec format error
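An `exec format error` usually means the binary in the image was built for a different CPU architecture than the node it runs on. As a rough sketch (the helper name `to_image_arch` is made up for illustration, not part of datashim), this maps the output of `uname -m` to the architecture names used in image manifests and Kubernetes node labels:

```shell
# Map a `uname -m` machine name to the architecture name used by
# container image manifests and the kubernetes.io/arch node label.
# Hypothetical helper, for illustration only.
to_image_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)             echo "$1" ;;   # pass unknown names through unchanged
  esac
}

# Architecture of the machine running this script.
to_image_arch "$(uname -m)"
```

On a cluster, `kubectl get nodes -L kubernetes.io/arch` should show the same value per node, which can then be compared with the architectures the download image was built for.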

What you expected to happen: The download pod should run on arm64 nodes and the archive contents should be visible in the pod that mounts the dataset.

How to reproduce it (as minimally and precisely as possible):

  1. Run
    kubectl apply -n dlf -f https://github.com/AlessandroPomponio/datashim/blob/ap_365_readme_updates/examples/minio/minio.yaml
  2. Run
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        dataset.0.id: "archive-dataset"
        dataset.0.useas: "mount"
    spec:
      containers:
        - name: nginx
          image: nginx
    EOF
  3. Run
    kubectl exec nginx -- ls mnt/datasets/archive-dataset/

Anything else we need to know?:

Environment:

srikumar003 commented 3 months ago

The function here: https://github.com/datashim-io/datashim/blob/da435c8e470cdbfb61a5772efbc9c97cf4586698/src/dataset-operator/controllers/archive_handler.go#L14 references a Docker image that is not built with multi-arch support, so it cannot run on arm64 nodes. The Dockerfile for that image is here: https://github.com/datashim-io/datashim/blob/da435c8e470cdbfb61a5772efbc9c97cf4586698/src/cos-uploader/Dockerfile#L2
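One way to check whether an image is multi-arch is to look at the platforms listed in its manifest. The following is only a sketch: the helper `manifest_has_arch`, the sample JSON, and the image name in the comment are illustrative assumptions, not taken from the datashim repository.

```shell
# Succeeds if the manifest JSON on stdin lists the given architecture.
# Hypothetical helper, for illustration only.
manifest_has_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Possible usage against a registry (placeholder image name):
#   docker manifest inspect example.org/some/image:tag | manifest_has_arch arm64

# Sample manifest list that only contains an amd64 entry.
sample='{"manifests":[{"platform":{"architecture":"amd64","os":"linux"}}]}'
if printf '%s' "$sample" | manifest_has_arch arm64; then
  echo "arm64 supported"
else
  echo "arm64 missing"
fi
```

For an image published without a manifest list, the inspected output may contain no platform entries at all, in which case this check reports the architecture as missing.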

To fix this issue:

AlessandroPomponio commented 3 months ago

Everything seems to have worked: https://github.com/datashim-io/datashim/actions/runs/10167107008/job/28122136358 cc @srikumar003

srikumar003 commented 3 months ago

Closing this issue