Open kaparora opened 4 years ago
I, too, would highly appreciate seeing Nomad treated as a first-class orchestrator.
+1
We are also waiting for this enhancement.
+1
+1
We've seen that people are interested in using Trident with Nomad but only in this GitHub issue. Please contact your NetApp account team and provide information on the solution you are implementing. This will help us prioritize supporting Nomad based on the customer demand.
While this is fine and I can do this, CSI is the de facto standard now - https://github.com/container-storage-interface/spec/blob/master/spec.md - and should be supported as is, not only in Kubernetes deployments. That way it could be used with other products, for example Nomad.
Hi @cantorek,
Thanks for your comment. Trident supports CSI and is tested with multiple environments. Trident should work with a CSI-compatible orchestrator that follows the CSI spec. Thanks for agreeing to follow up with your NetApp account team.
+1
+1
I've decided to give it a try (running Trident under Nomad) and... it didn't work.
Here is a Nomad job file I wrote (it can be used as a minimal reproducer for my issue):
variable "managementLIF" {
  type = string
}
variable "dataLIF" {
  type = string
}
variable "svm" {
  type = string
}
variable "nas_username" {
  type = string
}
variable "nas_password" {
  type = string
}
variable "aggregate" {
  type = string
}
job "csi-trident" {
  datacenters = ["dc1"]
  type = "system"
  group "csi-trident" {
    count = 1
    network {
      port "https" {
        static = 8443
      }
    }
    task "trident_orchestrator_init" {
      lifecycle {
        hook = "prestart"
        sidecar = false
      }
      driver = "docker"
      config {
        image = "cfssl/cfssl"
        entrypoint = ["/bin/bash"]
        args = ["/local/gen_certs.sh"]
        mount {
          type = "volume"
          target = "/certs"
          source = "certs"
          readonly = false
        }
      }
      template {
        data = <<EOF
{
  "CN": "trident-ca",
  "key": { "algo": "rsa", "size": 2048 },
  "names": [ {"C": "US", "ST": "TX"} ]
}
EOF
        destination = "local/ca-csr.json"
      }
      template {
        data = <<EOF
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "trident-node": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "trident-server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      }
    }
  }
}
EOF
        destination = "local/ca-config.json"
      }
      template {
        data = <<EOF
{
  "CN": "trident-node",
  "key": { "algo": "rsa", "size": 2048 },
  "names": [ {"C": "US", "ST": "TX"} ]
}
EOF
        destination = "local/trident-node-csr.json"
      }
      template {
        data = <<EOF
{
  "CN": "trident-server",
  "key": { "algo": "rsa", "size": 2048 },
  "names": [ {"C": "US", "ST": "TX"} ],
  "hosts": [ "trident-csi" ]
}
EOF
        destination = "local/trident-server-csr.json"
      }
      template {
        data = <<EOF
set -eux -o pipefail
cd /certs
test -f aesKey || head -c 32 < /dev/urandom | base64 > aesKey
cfssl gencert -initca /local/ca-csr.json | cfssljson -bare ca
cfssl gencert -ca ca.pem -ca-key ca-key.pem -config /local/ca-config.json \
  -profile=trident-server /local/trident-server-csr.json | cfssljson -bare server
cfssl gencert -ca ca.pem -ca-key ca-key.pem -config /local/ca-config.json \
  -profile=trident-node /local/trident-node-csr.json | cfssljson -bare node
EOF
        destination = "local/gen_certs.sh"
      }
    }
    task "trident_orchestrator" {
      driver = "docker"
      env {
        TRIDENT_CSI_SERVICE_PORT = "8443"
      }
      config {
        image = "netapp/trident:21.04.0"
        entrypoint = ["/trident_orchestrator"]
        args = [
          "--aes_key=/certs/aesKey",
          "--config=local/netappdvp.json",
          "--address=0.0.0.0",
          "--port=8000",
          "--rest",
          "--https_address=0.0.0.0",
          "--https_port=8443",
          "--https_rest",
          "--https_server_cert=/certs/server.pem",
          "--https_server_key=/certs/server-key.pem",
          "--csi_node_name=${node.unique.name}",
          "--csi_endpoint=unix://csi/csi.sock",
          "--csi_role=allInOne",
          "--https_ca_cert=/certs/ca.pem",
          "--https_client_cert=/certs/node.pem",
          "--https_client_key=/certs/node-key.pem",
          "--passthrough"
        ]
        ports = ["https"]
        mount {
          type = "volume"
          target = "/certs"
          source = "certs"
          readonly = true
        }
        mount {
          type = "bind"
          source = "local/hosts"
          target = "/etc/hosts"
        }
      }
      template {
        data = <<EOF
{{ env "NOMAD_HOST_IP_https" }} trident-csi
EOF
        destination = "local/hosts"
      }
      template {
        data = <<EOF
{
  "version": 1,
  "storageDriverName": "ontap-nas",
  "managementLIF": "${var.managementLIF}",
  "dataLIF": "${var.dataLIF}",
  "svm": "${var.svm}",
  "username": "${var.nas_username}",
  "password": "${var.nas_password}",
  "aggregate": "${var.aggregate}"
}
EOF
        destination = "local/netappdvp.json"
      }
      csi_plugin {
        id = "trident-csi"
        type = "monolith"
        mount_dir = "/csi"
      }
    }
  }
}
There is a lot of SSL setup here because, apparently, there is no way to run Trident without SSL, even for testing. I'm running Trident in "allInOne" mode, so it serves both the controller and node roles.
Nomad detects the CSI plugin successfully:
Then I'm trying to create a volume. Here is my volume.hcl file:
id = "test_netapp_volume"
name = "test_netapp_volume"
type = "csi"
plugin_id = "trident-csi"
capacity_min = "10GiB"
capacity_max = "20G"
capability {
  access_mode = "single-node-writer"
  attachment_mode = "file-system"
}
mount_options {
  fs_type = "ext4"
  mount_flags = []
}
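The two capacity bounds above mix a binary suffix (`GiB`) with a bare `G`. Assuming Nomad reads the bare `G` as a decimal gigabyte (an assumption worth checking against the Nomad volume specification, not something stated in this thread), the two bounds work out roughly as follows:

```shell
#!/bin/sh
# Assumption: "GiB" means 1024^3 bytes and a bare "G" means 1000^3 bytes.
gib=$((1024 * 1024 * 1024))
gb=$((1000 * 1000 * 1000))

capacity_min_bytes=$((10 * gib))   # "10GiB"
capacity_max_bytes=$((20 * gb))    # "20G", under the decimal assumption

echo "capacity_min: $capacity_min_bytes bytes"
echo "capacity_max: $capacity_max_bytes bytes"
```

Either way the minimum stays below the maximum, so the mixed units should not affect the reproducer.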
It gets created without issues:
# nomad volume create volume.hcl
Created external volume test_netapp_volume with ID test_netapp_volume
Now, I'm trying to submit a job that would mount this volume:
job "csi-job" {
  datacenters = ["dc1"]
  type = "service"
  group "csi-job" {
    count = 1
    network {
      port "http" {
        to = 80
      }
    }
    volume "test_netapp_volume" {
      type = "csi"
      source = "test_netapp_volume"
      read_only = false
      attachment_mode = "file-system"
      access_mode = "single-node-writer"
    }
    task "www" {
      driver = "docker"
      config {
        image = "nginx"
        ports = ["http"]
      }
      volume_mount {
        volume = "test_netapp_volume"
        destination = "/usr/share/nginx/html"
      }
    }
  }
}
And this job won't start, because:
failed to setup alloc: pre-run hook "csi_hook" failed: node plugin returned an internal error, check the plugin allocation logs for more information: rpc error: code = Internal desc = rpc error: code = Internal desc = open /var/lib/trident/tracking/test_netapp_volume.json: no such file or directory
Relevant logs from trident controller:
time="2021-06-18T21:13:43Z" level=warning msg="Node preparation for NFS failed; NFS mounts to this node may fail." FailureReason="error preparing NFS packages on the host; could not determine NFS packages; unsupported Linux distro" requestID=d7a71127-525a-4ec5-b78d-b4a0b3b4c121 requestSource=CSI
time="2021-06-18T21:13:43Z" level=error msg="Unable to write tracking file." error="open /var/lib/trident/tracking/test_netapp_volume.json: no such file or directory" requestID=d7a71127-525a-4ec5-b78d-b4a0b3b4c121 requestSource=CSI volumeId=test_netapp_volume
time="2021-06-18T21:13:43Z" level=error msg="GRPC error: rpc error: code = Internal desc = rpc error: code = Internal desc = open /var/lib/trident/tracking/test_netapp_volume.json: no such file or directory" requestID=d7a71127-525a-4ec5-b78d-b4a0b3b4c121 requestSource=CSI
time="2021-06-18T21:13:43Z" level=info msg="target path (/csi/per-alloc/3dba91f2-a169-07f8-c272-7b714b1d8924/test_netapp_volume/rw-file-system-single-node-writer) not found; volume is not mounted." Method=NodeUnpublishVolume Type=CSI_Node requestID=6f607a5c-6c88-4c3a-a4b7-5d278770a66e requestSource=CSI
time="2021-06-18T21:13:43Z" level=error msg="Removing tracking file failed." error="remove /var/lib/trident/tracking/test_netapp_volume.json: no such file or directory" requestID=67f7bf81-23e8-45de-9618-d40647cd769f requestSource=CSI trackingFilename=/var/lib/trident/tracking/test_netapp_volume.json
time="2021-06-18T21:13:43Z" level=error msg="Failed to remove tracking file: remove /var/lib/trident/tracking/test_netapp_volume.json: no such file or directory" requestID=67f7bf81-23e8-45de-9618-d40647cd769f requestSource=CSI volumeId=test_netapp_volume
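The `no such file or directory` errors in the log above are what a file create returns when the parent directory is missing. That suggests (as an assumption, not something confirmed in this thread) that `/var/lib/trident/tracking` does not exist inside the plugin container. A minimal shell sketch of that failure mode, using a temporary directory as a hypothetical stand-in for the real path:

```shell
#!/bin/sh
# Hypothetical stand-in for /var/lib/trident/tracking; this does not touch a real Trident install.
workdir=$(mktemp -d)
tracking_dir="$workdir/tracking"
tracking_file="$tracking_dir/test_netapp_volume.json"

# The parent directory is missing, so creating the file fails (ENOENT).
if ( echo '{}' > "$tracking_file" ) 2>/dev/null; then
    first_attempt="ok"
else
    first_attempt="failed"
fi

# Create the parent directory, then retry the same write.
mkdir -p "$tracking_dir"
if ( echo '{}' > "$tracking_file" ) 2>/dev/null; then
    second_attempt="ok"
else
    second_attempt="failed"
fi

echo "without directory: $first_attempt"
echo "with directory: $second_attempt"

rm -rf "$workdir"
```

If that is the cause here, making sure the directory exists in the container before the node plugin handles a publish request (for example through an extra mount targeting `/var/lib/trident/tracking`) might get past this particular error. This is untested, and there may be further gaps in non-Kubernetes support.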
Searching the source code for "Unable to write tracking file" leads to the Frontend component: https://github.com/NetApp/trident/blob/master/frontend/csi/node_server.go#L1145
Our NetApp support contact told us the same thing: the Frontend is missing some pieces required to support orchestrators other than Kubernetes. 😞
I hope NetApp will prioritize work on this issue!
@gnarl this issue is preventing us from using the native NetApp connection, forcing us to fall back on NFS services. Disappointing.
Hi @michimau,
Please contact your NetApp account team and ask them to work to increase visibility on the request to add Nomad support to Trident.
Hello,
I'm digging up an old topic, but has there been any news on this subject since then?
Describe the solution you'd like
Documentation on how to get Trident running with any other CSI-compatible orchestrator. HashiCorp Nomad will support CSI starting with version 0.11 (https://www.hashicorp.com/blog/hashicorp-nomad-container-storage-interface-csi-beta/). What will it take to get Trident working with Nomad? Is it possible? If yes, to what extent? What kind of effort will be required to implement and support this?
Describe alternatives you've considered
I could not find any documentation or guidance on this topic.
Additional context
None