This should be fairly straightforward to implement for the CLI. We can use KubernetesController.getConfigItemsByPrefix
to collect all the relevant options from the config. As per our discussions, should we be using dynamic provisioning?
The suggested commands below are based on the Spark example. Is there anything else we should add?
--config-property heron.kubernetes.persistentVolumeClaim.[volume name].options.claimName=OnDemand
--config-property heron.kubernetes.persistentVolumeClaim.[volume name].options.storageClass=gp
--config-property heron.kubernetes.persistentVolumeClaim.[volume name].options.sizeLimit=500Gi
--config-property heron.kubernetes.persistentVolumeClaim.[volume name].mount.path=/data
--config-property heron.kubernetes.persistentVolumeClaim.[volume name].mount.readOnly=false
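As a rough illustration of the prefix-based collection described above, here is a minimal Python sketch. It is not the KubernetesController.getConfigItemsByPrefix implementation; the function name collect_pvc_options and the dict-based config are assumptions made for illustration, and the prefix is simply taken from the suggested commands.

```python
from collections import defaultdict

# Prefix taken from the suggested commands; treating it as a constant is an assumption.
PVC_PREFIX = "heron.kubernetes.persistentVolumeClaim."


def collect_pvc_options(config):
    """Group PVC-related config entries by volume name (hypothetical helper).

    Returns {volume_name: {"options.claimName": value, "mount.path": value, ...}}.
    """
    volumes = defaultdict(dict)
    for key, value in config.items():
        if not key.startswith(PVC_PREFIX):
            continue  # unrelated config entry, ignore it
        remainder = key[len(PVC_PREFIX):]
        # The first segment is the volume name, the rest is the option path.
        volume_name, _, option = remainder.partition(".")
        if not volume_name or not option:
            raise ValueError(f"Malformed PVC config key: {key}")
        volumes[volume_name][option] = value
    return dict(volumes)
```

Grouping by volume name first keeps each volume's options together, so a later step can build one claim and one mount per volume.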
I can start work on this, but I can only take it as far as the point where it needs to be wired into #3710. I will then need to rebase
onto that PR and wire it in. At that point I will also need to clean up the test suite and handle other general merge-conflict clean-up.
An idea that I had is a workflow where users put all their K8s configs, including the pod template, into a directory and then load them into a ConfigMap. The configs users wish to have loaded into the containers are then specified using --config-property.
Looking through the Spark documentation there seem to be the following options supported:
claimName
storageClass
sizeLimit
path
subPath
readOnly
There is a multitude of options available on the K8s API.
I have the PVC assembly part of the PR completed and I am now working on wiring all this up to make sure it works correctly with custom Pod Templates.
Commands:
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.claimName=nameOfVolumeClaim
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.storageClassName=storageClassNameOfChoice
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.accessModes=comma,separated,list
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.sizeLimit=555Gi
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.volumeMode=volumeModeOfChoice
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.path=path/to/mount
--config-property heron.kubernetes.volumes.persistentVolumeClaim.volumeNameOfChoice.subPath=sub/path/to/mount
These commands will generate the following PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nameOfVolumeClaim
spec:
  volumeName: volumeNameOfChoice
  accessModes:
    - comma
    - separated
    - list
  volumeMode: volumeModeOfChoice
  resources:
    requests:
      storage: 555Gi
  storageClassName: storageClassNameOfChoice
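To make the mapping from command options to PVC fields concrete, here is a minimal Python sketch that assembles the same manifest as a plain dict. The function name build_pvc and the options dict are hypothetical; the actual PR builds the claim in Java, and this only mirrors the field layout shown in the generated PVC.

```python
def build_pvc(volume_name, options):
    """Assemble a PersistentVolumeClaim manifest as a dict (illustrative only).

    `options` holds the values passed via --config-property, keyed by the
    last segment of the property name (claimName, sizeLimit, ...).
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": options["claimName"]},
        "spec": {
            "volumeName": volume_name,
            # accessModes arrives as a comma-separated string.
            "accessModes": options["accessModes"].split(","),
            "volumeMode": options["volumeMode"],
            "resources": {"requests": {"storage": options["sizeLimit"]}},
            "storageClassName": options["storageClassName"],
        },
    }
```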
Entries will be made in the Pod for a Volume and in the executor container for the VolumeMount, with the path as well as the subPath, as required.
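For reference, a minimal Python sketch of what the Pod-level Volume and container-level VolumeMount entries might look like as plain dicts. The function name build_volume_and_mount is hypothetical; the field names (persistentVolumeClaim.claimName, mountPath, subPath) are the standard Kubernetes ones, and subPath is only set when it was supplied.

```python
def build_volume_and_mount(volume_name, options):
    """Return (volume, volume_mount) dicts for one PVC-backed volume (illustrative)."""
    # Pod-level entry: ties the volume name to the claim.
    volume = {
        "name": volume_name,
        "persistentVolumeClaim": {"claimName": options["claimName"]},
    }
    # Container-level entry: where the volume appears inside the executor.
    mount = {"name": volume_name, "mountPath": options["path"]}
    if "subPath" in options:
        mount["subPath"] = options["subPath"]
    return volume, mount
```

The Volume and the VolumeMount share the same name, which is how Kubernetes links the mount in the container to the volume declared on the Pod.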
The commands above are all that I have added for now, but the code is designed so that a new PVC property can be added easily: add an enum entry for the property, then add a case to the switch statement that sets it on the actual PVC. This should make things more maintainable and significantly more extensible.
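The enum-plus-switch pattern can be sketched roughly as follows. This is a Python illustration, not the Java code in the PR; the names PvcOption and apply_option are hypothetical, and an if/elif chain stands in for the switch statement.

```python
from enum import Enum


class PvcOption(Enum):
    """Supported PVC properties; supporting a new one starts with a new member here."""
    CLAIM_NAME = "claimName"
    STORAGE_CLASS_NAME = "storageClassName"
    ACCESS_MODES = "accessModes"
    SIZE_LIMIT = "sizeLimit"
    VOLUME_MODE = "volumeMode"


def apply_option(pvc, option, value):
    """Set a single property on the PVC dict; one branch per enum member."""
    spec = pvc.setdefault("spec", {})
    if option is PvcOption.CLAIM_NAME:
        pvc.setdefault("metadata", {})["name"] = value
    elif option is PvcOption.STORAGE_CLASS_NAME:
        spec["storageClassName"] = value
    elif option is PvcOption.ACCESS_MODES:
        spec["accessModes"] = value.split(",")
    elif option is PvcOption.SIZE_LIMIT:
        spec.setdefault("resources", {}).setdefault("requests", {})["storage"] = value
    elif option is PvcOption.VOLUME_MODE:
        spec["volumeMode"] = value
    else:
        raise ValueError(f"Unhandled PVC option: {option}")
    return pvc
```

Looking up a member by its string value, for example PvcOption("sizeLimit"), raises ValueError for unknown names, so an unsupported property fails early instead of being silently ignored.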
We would like to add a set of submit parameters that allow for specifying PersistentVolumeClaims and mount points, similar to the feature found in Spark (described here: https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes).