mmishael opened this issue 3 months ago
I have zero experience with Kubernetes, so hopefully another user who knows how to do it can help you.
Or maybe this pull request from another (but very similar) project of mine can help: https://github.com/dockur/windows/pull/304/files
Could someone please help with this?
I started experimenting with installing it on my Harvester Kube cluster.
My manifests are here: https://gitlab.acloud.app/system/harvester/-/tree/main/dsm
Download them, customize them for your needs, and apply.
Logs:

```
kubectl logs -f -n dsm dsm
❯ Starting Virtual DSM for Docker v7.12...
❯ For support visit https://github.com/vdsm/virtual-dsm
❯ CPU: Intel Xeon Gold 5218 CPU | RAM: 226/252 GB | DISK: 251 GB (ext4) | HOST: 5.14.21-150400.24.108-default...
❯ Install: Downloading installer...
❯ Install: Downloading DSM_VirtualDSM_69057.pat...
❯ Install: Extracting downloaded image...
❯ Install: Preparing system partition...
❯ Install: Extracting system partition...
❯ Install: Installing system partition...
❯ Creating a 256G growable disk image in qcow2 format...
Formatting '/storage/data.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=off compression_type=zlib size=274877906944 lazy_refcounts=off refcount_bits=16
❯ Booting Virtual DSM...
```
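(Side note: the `Formatting ...` line above is raw qemu-img output. Creating the same image manually would look roughly like this, assuming qemu-img defaults match the logged options:)

```bash
# 256G in qemu-img means 256 GiB (binary), i.e. 274877906944 bytes as in the log
qemu-img create -f qcow2 /storage/data.qcow2 256G
```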
So far it's looking good.
I want to test it for a few days and will then send a PR...
I noticed that the disk space units don't match: Kubernetes uses `Gi`, while everyone else usually uses `GB`. So my `256Gi` Kubernetes PVC shows as only `251GB` in Synology, and since I have `256GB` in the environment variable, that doesn't match...
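(A quick illustration of the `Gi` vs `GB` difference, using GNU coreutils `numfmt`; the exact number Synology displays will also depend on filesystem overhead:)

```bash
# 256 GiB (binary) expressed in decimal SI units:
numfmt --from=auto --to=si 256Gi    # -> 275G
# 256 GB (decimal) expressed in binary IEC units:
numfmt --from=auto --to=iec-i 256G  # -> 239Gi
```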
P.S. While testing, I found that if I open the UI via IP:port, everything works. But if I use an Ingress and open it via the domain name, I get the error:
"You are not authorized to use this service."
Wow thanks!!! I will look at it.
@mmishael I have successfully run vDSM on TrueNAS SCALE 24.04.0 for a period of time and enabled vDSM to obtain an IP through DHCP. I have provided some important configuration screenshots, hoping they can be helpful to you.
*(configuration screenshots omitted)*
Thanks!
I did all the steps you noted and got an IP, but I get redirected to a DSM IP that is not available on my network (I did not skip the DHCP env).
Do you know what could be the problem?
There is now an example Kubernetes manifest thanks to @SlavikCA:
https://github.com/vdsm/virtual-dsm/blob/master/kubernetes.yml
The config above is the basic one. I ran more tests, and it looks to me that the most optimal network setup is not just to expose the UI port, but to give DSM a dedicated IP in the LAN. That configuration can differ depending on the Kubernetes cluster setup.
I'm using a Harvester 1.3.0 cluster with two NICs (one for management and `net2` for VMs), and here is my pod config:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dsm
  labels:
    name: dsm
  namespace: dsm
  annotations:
    # https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md#launch-pod-with-json-annotation
    k8s.v1.cni.cncf.io/networks: '[{"name":"net2","namespace":"default","interface":"net2"}]'
spec:
  terminationGracePeriodSeconds: 120 # the Kubernetes default is 30 seconds, which may not be enough
  containers:
    - name: dsm
      image: vdsm/virtual-dsm
      resources:
        limits:
          devices.kubevirt.io/kvm: 1
          devices.kubevirt.io/vhost-net: 1
      securityContext:
        privileged: true
        capabilities:
          add: ["NET_ADMIN"]
      env:
        - name: RAM_SIZE
          value: 4G
        - name: CPU_CORES
          value: "4"
        - name: DISK_SIZE
          value: "250G"
        - name: DISK_FMT
          value: "qcow2"
        - name: DHCP
          value: "Y"
        - name: VM_NET_DEV
          value: "net2"
      volumeMounts:
        - mountPath: /storage
          name: dsm256-ssd
  volumes:
    - name: dsm256-ssd
      persistentVolumeClaim:
        claimName: dsm256-ssd-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: dsm-service
  namespace: dsm
spec:
  type: ExternalName
  externalName: 10.0.4.124
  ports:
    - protocol: TCP
      port: 5000
      targetPort: 5000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dsm-ingress
  namespace: dsm
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - dsm.acloud.app
      secretName: dsm.acloud.app
  rules:
    - host: dsm.acloud.app
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dsm-service
                port:
                  number: 5000
```
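(For completeness, a minimal sketch of applying these manifests; the filename is hypothetical, and the namespace and PVC are assumed to exist already:)

```bash
kubectl apply -f dsm.yml       # hypothetical filename for the manifests above
kubectl logs -f -n dsm dsm     # follow the container log, as shown earlier
```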
With that config, my Samba shares work on Windows computers in the LAN.
> Do you know what could be the problem?
@mmishael On the screenshots above, it appears strange to me that VM_NET_DEV has one interface name, but then on another screenshot a different interface is selected. That could be the problem.
Hi, thanks. I think it should be eth0; that's the container's internal interface, so it doesn't have "eno1". Something with DHCP doesn't work: the container doesn't get a DHCP address.
@mmishael hi,
You can try setting these two options under "Kubernetes Settings". You should choose the corresponding value based on your network card: enp4s0 is my PCIe 10 Gigabit network card, and the network card that comes with your device is generally called "eno1". Please also fill in your own gateway address for the "Route v4 Gateway" option below.
The description I provided in the previous screenshot was not clear enough. The value of "VM_NET_DEV" is the name of the network card in the container, not the name of the network card in TrueNAS. I speculate that in your environment it should be filled in with "eth0", and "Host Interface" should be set to the "eno1" interface.
You can contact me with @wy19xx; without the @, I won't receive GitHub's email notification.
@wy19xx The container automatically detects the network interface name inside the container:

```bash
VM_NET_DEV=$(awk '$2 == 00000000 { print $1 }' /proc/net/route)
```

If this method does not work correctly in your case, it would be better to improve this auto-detection code than to override the name manually using an environment variable.
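(For readers unfamiliar with `/proc/net/route`: each row lists an interface and a hex destination, and the default route's destination is `00000000`. An annotated equivalent of the same one-liner, not the project's actual code:)

```bash
# /proc/net/route columns: Iface  Destination  Gateway  Flags ...
# The default route's Destination is 00000000, so print that row's Iface.
VM_NET_DEV=$(awk '$2 == "00000000" { print $1 }' /proc/net/route)
echo "$VM_NET_DEV"   # e.g. eth0
```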
@kroese In my case the VM_NET_DEV was not detected correctly, so I had to set it manually:

```yaml
- name: VM_NET_DEV
  value: "net2"
```

I think that's because the VM had a few interfaces which could access the WAN, but only one was reachable from outside.
Should `vdsm` be configured to listen on all interfaces by default?
@SlavikCA Normally under Docker the interface inside the container is always called `eth0` (no matter what your real interface on the host is called). The only reason I added this auto-detection is that under a rootless Podman container this interface is called `tap0`.
I never saw names like `net2` before under Docker, so it must be something specific to Kubernetes. If the name is always `net2` on every installation of Kubernetes, we could just add a simple check: if `net2` exists, use that name. That way it would work without having to set it manually, and it would not affect normal Docker users.
Yes, I had guessed it was detected automatically, so there shouldn't be a need to fill it in when there is only one network card, but I have two network cards. Previously, when I didn't fill in this item and started the container, an error message was displayed. I posted all my configuration without any changes in order to improve the odds of a successful configuration.
> If the name is always net2 on every installation of Kubernetes...
No, `net2` is not typical.
It actually came from my annotation:

```yaml
annotations:
  k8s.v1.cni.cncf.io/networks: '[{"name":"net2","namespace":"default","interface":"net2"}]'
```
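(For context, the annotation's `"name":"net2"` refers to a multus NetworkAttachmentDefinition in the `default` namespace. A minimal sketch of what such an object could look like; the `macvlan` type, the `eno2` master interface, and the empty IPAM are assumptions, not taken from this thread:)

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: net2
  namespace: default
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eno2",
      "mode": "bridge",
      "ipam": {}
    }
```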
I'm not a network or Kubernetes expert, but here is my understanding of how it works in Kubernetes.

The default scenario:

- `vdsm` will correctly attach to the pod's default interface
- to expose the UI, the user needs to add a Service: NodePort, LoadBalancer, ...

Put `vdsm` into the LAN:

- requires `multus` installed in the Kubernetes cluster
- `vdsm` will listen on that LAN interface, so network discovery will work in the LAN; also, ports can be forwarded on the router, and no Services are needed on the Kubernetes cluster

Yes, normally there will always be one interface that has an internet connection, so my command:
```bash
awk '$2 == 00000000 { print $1 }' /proc/net/route
```

just selects the first one, I guess, and doesn't account for the situation where there can be multiple.
So if one of you can get terminal access inside the running container (in Portainer it's called "attach to console"), it would be helpful if you could execute this command:

```bash
cat /proc/net/route
```

and post the output here. Because if the Kubernetes IP always has a certain range (always starts with 123.x.x.x, for example), we can use that info to ignore that interface.
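(Side note for reading those tables: the Destination and Gateway columns are little-endian hex. A small bash sketch to decode them, using gateway values that appear in the outputs below:)

```bash
# decode a little-endian hex IP from /proc/net/route into dotted-quad form
hex2ip() {
  printf '%d.%d.%d.%d\n' "0x${1:6:2}" "0x${1:4:2}" "0x${1:2:2}" "0x${1:0:2}"
}
hex2ip 0101FEA9   # -> 169.254.1.1 (a link-local gateway)
hex2ip 010010AC   # -> 172.16.0.1
```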
If configured with two interfaces, here is what I see:

```
# kubectl exec -it -n dsm dsm -- cat /proc/net/route
Iface   Destination   Gateway    Flags  RefCnt  Use  Metric  Mask      MTU  Window  IRTT
eth0    00000000      0101FEA9   0003   0       0    0       00000000  0    0       0
eth0    0101FEA9      00000000   0005   0       0    0       FFFFFFFF  0    0       0

# kubectl exec -it -n dsm dsm -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if108: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 02:2d:64:71:e8:1d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.52.3.236/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2d:64ff:fe71:e81d/64 scope link
       valid_lft forever preferred_lft forever
4: net2@if109: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether f6:7e:e4:e7:04:65 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::f47e:e4ff:fee7:465/64 scope link
       valid_lft forever preferred_lft forever
5: dsm@net2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 500
    link/ether 02:11:32:70:4e:69 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::11:32ff:fe70:4e69/64 scope link
       valid_lft forever preferred_lft forever
```
Mmhh... This `net2` is not present in `/proc/net/route` at all. And in `ip a` it does not even show an IPv4 address (only IPv6, somehow). Also, `eth0` has a very normal-looking IP (10.52.3.236) that is not exclusively used by Kubernetes.
So I don't have any idea how I could detect that it should not use `eth0` in this case.
@kroese This is my output:

```
# cat /proc/net/route
Iface   Destination   Gateway    Flags  RefCnt  Use  Metric  Mask      MTU  Window  IRTT
eth0    00000000      010010AC   0003   0       0    0       00000000  0    0       0
eth0    000010AC      00000000   0001   0       0    0       0000FFFF  0    0       0
net1    0001A8C0      00000000   0001   0       0    0       00FFFFFF  0    0       0

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host proto kernel_lo
       valid_lft forever preferred_lft forever
3: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether fa:4f:52:dc:85:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.0.114/16 brd 172.16.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f84f:52ff:fedc:85c3/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
4: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether b2:ca:5f:0d:ac:60 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.185/24 brd 192.168.1.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::b0ca:5fff:fe0d:ac60/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
5: dsm@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 500
    link/ether 02:11:32:28:79:05 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::11:32ff:fe28:7905/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
```
@wy19xx Thanks. The whole problem is that Kubernetes makes `eth0` the default interface inside the container. I read that Docker sorts the interfaces in alphabetical order when deciding which one becomes the default, so maybe because the letter E(th0) comes before the letter N(et2) it picks the wrong one. And it's not a good idea to always pick the non-default one when there are two interfaces, because that might break networking for other people.
So the only fix I could do is to check if a device called `net0`, `net1`, `net2` or `net3` exists, and in that case always prefer it over the default interface. But as @SlavikCA already pointed out, this is not foolproof because people can give them other names than `netX`.
Anyways, I released a new image (v7.13) that includes this fix now.
Hi @wy19xx,
I appreciate your efforts. I tried the steps you mentioned and it didn't work for me; I was still redirected to "The location of DSM is http://169.254.182.247:5000/", which is not on my network (10.10.x.x).
As you mentioned, I can see my interfaces in the TrueNAS dashboard, and it's "eno1". I set my IPv4 gateway and still no change.
Please see the uploaded log; not sure if it can help...
@mmishael Hi, I'm sorry to say that the above configuration is all I have. I know nothing about the technical details of obtaining IPs; these configurations are the result of my attempts to work through the permutations and combinations of the various options. @SlavikCA has committed the Kubernetes configuration file to the master branch, and @kroese has refined the auto-detection of the VM_NET_DEV configuration, so perhaps you could try pulling the latest image and rebuilding the container again?
I have TrueNAS SCALE; Docker was removed from the base OS. The apps system on SCALE is k3s, but k3s switched from Docker to containerd as the runtime, so Docker was removed.
I'm not familiar with k3s. I tried to convert the compose file to k3s, but was unable to do so; I also tried to deploy it via Portainer, and it failed.
Could you post a k3s YAML to deploy via Portainer? Thanks
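(A minimal sketch of applying the example manifest mentioned earlier in this thread directly on k3s; the raw URL is derived from the blob link above, and the `dsm` namespace/pod names are assumptions based on that file:)

```bash
# apply the repo's example manifest with k3s' bundled kubectl
sudo k3s kubectl apply -f https://raw.githubusercontent.com/vdsm/virtual-dsm/master/kubernetes.yml
sudo k3s kubectl get pods -A | grep dsm    # locate the pod (namespace assumed)
sudo k3s kubectl logs -f -n dsm dsm        # follow the log (names assumed)
```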