Closed by samos123 1 year ago
Edit: This has been resolved; it was a bug in the controller.
Weird, the dataset loader pod isn't getting any environment variables when using params. Dataset spec:
apiVersion: substratus.ai/v1
kind: Dataset
metadata:
  name: k8s-instructions
spec:
  params:
    urls: https://huggingface.co/datasets/substratusai/k8s-instructions/raw/main/k8s-instructions.jsonl
  image:
    git:
      url: https://github.com/substratusai/images
      path: dataset-loader-http
      branch: dataset-http-loader
Error in the pod:
ValueError                                Traceback (most recent call last)
Cell In[2], line 3
      1 urls = os.environ.get("PARAM_URLS")
      2 if not urls:
----> 3     raise ValueError("Missing required environment variable PARAM_URLS. "
      4                      "For example, set `spec.params: {urls: http://s.com/dataset.jsonl}` "
      5                      "in the Dataset resource")
      7 urls = urls.strip().split(",")
      8 urls
ValueError: Missing required environment variable PARAM_URLS. For example, set `spec.params: {urls: http://s.com/dataset.jsonl}` in the Dataset resource
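For anyone hitting the same error before picking up the controller fix, here is a rough sketch of a loader that falls back to the mounted params file when `PARAM_URLS` is missing. The pod mounts `/content/params.json`, but its layout is not shown in this issue, so the code assumes it is a flat JSON object mirroring `spec.params`. The function name `load_urls` is hypothetical and not part of the dataset-loader-http image.

```python
import json
import os
from pathlib import Path


def load_urls(env=os.environ, params_file="/content/params.json"):
    """Return dataset URLs from PARAM_URLS, else from the params file."""
    raw = env.get("PARAM_URLS")
    path = Path(params_file)
    if not raw and path.is_file():
        # Assumption: params.json is a flat object such as {"urls": "..."}.
        raw = json.loads(path.read_text()).get("urls")
    if not raw:
        raise ValueError(
            "Missing PARAM_URLS and no 'urls' key found in " + str(path)
        )
    # Same comma-separated convention as the original notebook cell.
    return [u.strip() for u in str(raw).split(",") if u.strip()]
```

The environment variable still wins when it is present, so the behavior is unchanged once the controller sets it correctly.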
Looking at the pod spec, there are no environment variables set:
load:
  Container ID:   containerd://923dd119853ec66b31f2584829eba3df5b54f8415cd109b865ecaa519a03a807
  Image:          us-central1-docker.pkg.dev/sam-argolis/substratus/substratus-dataset-default-k8s-instructions
  Image ID:       us-central1-docker.pkg.dev/sam-argolis/substratus/substratus-dataset-default-k8s-instructions@sha256:cac918196e7e2bf37b2ba13ebb3e88e5416ad378c54076e0ff8eeddbf129da9a
  Port:           <none>
  Host Port:      <none>
  State:          Terminated
    Reason:       Error
    Exit Code:    1
    Started:      Fri, 21 Jul 2023 22:22:42 -0700
    Finished:     Fri, 21 Jul 2023 22:22:45 -0700
  Ready:          False
  Restart Count:  0
  Requests:
    cpu:     2
    memory:  4Gi
  Environment:  <none>
  Mounts:
    /content/data from dataset (rw,path="d2ef7bd1e58854a4276474790f921613/data")
    /content/logs from dataset (rw,path="d2ef7bd1e58854a4276474790f921613/logs")
    /content/params.json from params (rw,path="params.json")
    /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mrmhf (ro)
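For reference, the mapping the loader expects can be inferred from its error message: `spec.params.urls` should show up as `PARAM_URLS`. Below is a minimal sketch of that naming convention, producing entries shaped like a container's `env` list. This is not the controller's actual code; `params_to_env` is a hypothetical name, and keys containing characters other than letters, digits, and underscores are not handled.

```python
def params_to_env(params):
    """Map spec.params keys to PARAM_<KEY> entries for a container env list."""
    return [
        # Environment variable values are always strings.
        {"name": f"PARAM_{key.upper()}", "value": str(value)}
        for key, value in sorted(params.items())
    ]


env = params_to_env({"urls": "http://s.com/dataset.jsonl"})
print(env)
```

With the Dataset spec above, a working controller would be expected to produce an `Environment:` section containing `PARAM_URLS` instead of `<none>`.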
Fixes #5