parallelo opened this issue 1 year ago
Still digging... seems like the mountPaths are goofed up in the Argo Workflow?
filebrowser pod (WORKS CORRECTLY):

```yaml
volumeMounts:
  - mountPath: /data/finetune-data
    name: finetune-data
...
volumes:
  - name: finetune-data
    persistentVolumeClaim:
      claimName: finetune-data
```
finetune-model-tokenizer pod (READ PERMISSION ERROR):

```yaml
volumeMounts:
  - mountPath: /finetune-data
    name: finetune-data
...
volumes:
  - name: finetune-data
    persistentVolumeClaim:
      claimName: finetune-data
```
Edit: Previously referenced `mainctrfs`, but that was just the wait container. Now just looking into the `mountPath` values:

- `/data/finetune-data` (working)
- `/finetune-data` (not working)
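For context, here is a minimal sketch of how the mount could be declared in the Workflow spec so the tokenizer container sees the same path the working filebrowser pod uses. The template name, image, and entrypoint are placeholders, not taken from the actual workflow:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: finetune-
spec:
  entrypoint: tokenizer
  volumes:
    - name: finetune-data
      persistentVolumeClaim:
        claimName: finetune-data
  templates:
    - name: tokenizer
      container:
        image: python:3.10  # placeholder image
        volumeMounts:
          - name: finetune-data
            mountPath: /data/finetune-data  # matches the working filebrowser path
```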
Hi! I'm working on reproducing your Argo workflow for fine-tuning GPT-J.
I'm able to create a PVC, download the dataset into it, and submit the Argo workflow.
However, whenever I try to read the dataset in the `tokenizer` step of the workflow, it hits a filesystem access error for the PVC. I tried updating the roles / role bindings for accessing the PVC, but that still has issues.

The events listed for the relevant `tokenizer` pods do not show any warnings/errors for attaching to the PVC.

Still troubleshooting... must be missing some further permissions somewhere. Please let me know if you have suggestions in the meantime. Thanks in advance!