abcdesktopio / oc.user

abcdesktop main graphical user container
GNU General Public License v2.0

Storage restriction on each user pod #52

Closed chintus777 closed 1 year ago

chintus777 commented 1 year ago

Hello Alexandre,

Really amazing project, I am currently testing the VDI. I want to restrict storage usage for each user pod; in my current scenario each user can use the full NFS storage available. Currently I am using dynamic provisioning with a StorageClass and a PVC spec in od.config. I have also tried deploying without a StorageClass by providing the mount path directly in the PV spec in od.config, but that didn't work either.

Inside od.config -

desktop.persistentvolumespec: None

desktop.persistentvolumeclaimspec: {
  'storageClassName': 'nfs-csi',
  'resources': {
    'requests': { 'storage': '10Gi' },
    'limits': { 'storage': '10Gi' }
  },
  'accessModes': [ 'ReadWriteMany' ]
}
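For reference, here is a sketch of the PVC that this persistentvolumeclaimspec corresponds to (the name and namespace are illustrative, not the ones pyos actually generates). Note that, as far as I know, Kubernetes itself only enforces resources.requests on a PVC; the limits entry is accepted but ignored by the API server, so any hard cap has to come from the storage backend:

```yaml
# Illustrative PVC equivalent of the od.config spec above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: user-home-example      # hypothetical name
  namespace: abcdesktop        # hypothetical namespace
spec:
  storageClassName: nfs-csi
  accessModes: [ ReadWriteMany ]
  resources:
    requests:
      storage: 10Gi            # the only size field Kubernetes enforces
    limits:
      storage: 10Gi            # ignored for PVCs; enforcement is up to the CSI driver
```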

My storage class YAML:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
provisioner: nfs.csi.k8s.io
parameters:
  server: x.x.x.x
  share: /nfs-data-drive
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:

On my NFS server, PVs are created per user:

pvc-10f527f7-7760-4041-9c87-437c008bebef pvc-77a0f35d-ba33-4de8-9f90-0094a263f73f

pvc-1b4ec935-2c55-4243-9ef5-4cbd419ad37a pvc-8e1da3ff-6006-435e-ae24-5a6f0a76282f

pvc-207fd8b3-7794-4f94-9bf1-7922cdb8da44 pvc-90fb2435-3c41-40f2-b391-6b3b047e4b8f

pvc-21da009e-775f-42f4-86a0-ad85f9cef08a pvc-a0465555-2218-4624-b295-32004f1896e8

pvc-22b1a862-fb24-4181-83f7-866bdbe1dbf2 pvc-aed8e7d8-5836-44ea-865d-e98dc4216213

pvc-231709fe-461f-4b71-a7c5-e4c65856c484 pvc-b53c6acf-c6d0-4449-98b8-9eabf9ae75a6

pvc-2a475a80-391c-4f6b-8e53-49d4372b8cc3 pvc-c37274c0-9a30-424b-ba43-cc74d9d62b84

pvc-3a3d9e46-663b-477f-a07d-b98e20ed10a9 pvc-e594628f-d427-47a5-9c21-bf9a860699ef

Inside a user pod, when I do df -h, it shows the full space of the NFS server available to each user, but I want to restrict each user's space to 10Gi. How can I achieve this?

Filesystem                                                         Size  Used Avail Use% Mounted on
overlay                                                            117G   73G   44G  63% /
tmpfs                                                               64M     0   64M   0% /dev
tmpfs                                                              9.8G     0  9.8G   0% /sys/fs/cgroup
tmpfs                                                              8.0G  8.0K  8.0G   1% /tmp
tmpfs                                                              4.0G     0  4.0G   0% /dev/shm
tmpfs                                                               20G     0   20G   0% /tmp/.X11-unix
tmpfs                                                               20G     0   20G   0% /tmp/.pulseaudio
tmpfs                                                               20G     0   20G   0% /tmp/.cupsd
/dev/vda1                                                          117G   73G   44G  63% /etc/hosts
tmpfs                                                              7.7M     0  7.7M   0% /run/user
x.x.x.x:/nfs-data-drive/pvc-231709fe-461f-4b71-a7c5-e4c65856c484  1007G   41G  915G   5% /home/balloon
tmpfs                                                              7.7M     0  7.7M   0% /run/dbus
tmpfs                                                              7.7M   52K  7.6M   1% /var/log/desktop
tmpfs                                                              980K   16K  964K   2% /run/desktop
tmpfs                                                               20G   44K   20G   1% /var/secrets/abcdesktop/localaccount
tmpfs                                                               20G  4.0K   20G   1% /var/secrets/abcdesktop/vnc
tmpfs                                                              9.8G     0  9.8G   0% /proc/acpi
tmpfs                                                              9.8G     0  9.8G   0% /proc/scsi
tmpfs                                                              9.8G     0  9.8G   0% /sys/firmware

chintus777 commented 1 year ago

Hello Alexandre ,

Could you find any possible solution for the problem? I have also changed my NFS driver to iSCSI and Ceph, but I am facing the same issue. Is there any other way of restricting the storage of each user pod, so that I can increase or decrease the storage allocated to a user dynamically? Please reply as soon as possible. Thanks

alexandredevely commented 1 year ago

Hello chintus777,

Thank you for your message chintus. In my point of view, this problem doesn't concern the abcdesktop.io project; it is more a Kubernetes storage class quota topic.

We would have the same issue with a simple pod, without using abcdesktop.io: you just need to set a restriction on the storage of each pod. You may have to set quotas on your NFS server side; this is an old-school design, but it should work.

About your df -h

The df -h command doesn't return the limits, and never will. Try to actually reach the limit using your resources; don't trust the df results.

'resources': {
'requests': { 'storage': '10Gi' },
'limits': { 'storage': '10Gi' }
}

Try to create a big file (I mean more than 10Gi in this case), and check if it works.
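This check can be sketched as a small script. TARGET and COUNT are placeholders: it defaults to /tmp with a tiny size so it is safe to run anywhere; inside an abcdesktop pod you would set TARGET=/home/balloon and a size above the 10Gi request:

```shell
#!/bin/sh
# Sketch: probe whether the volume quota is actually enforced.
# TARGET defaults to /tmp so the script is safe to copy-paste;
# in the user pod, use TARGET=/home/balloon and raise COUNT past the quota.
TARGET="${TARGET:-/tmp}"
COUNT="${COUNT:-8}"   # number of 1MiB blocks to write
if dd if=/dev/zero of="$TARGET/quota-test.img" bs=1M count="$COUNT" 2>/dev/null; then
  echo "write ok: quota not reached (or not enforced)"
else
  echo "write refused: quota (or disk) limit hit"
fi
rm -f "$TARGET/quota-test.img"
```

If the driver enforces the quota, dd fails with "No space left on device" once the limit is crossed, regardless of what df reports.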

If it doesn't work, pick a CSI driver with quota support from https://kubernetes-csi.github.io/docs/drivers.html and switch to it.

See you

Alexandre

chintus777 commented 1 year ago

Hello Alexandre, thanks a lot for your valuable suggestions. After trying different drivers, I found that the Cinder driver can be used for storage restriction on each pod; with this driver I can also expand or shrink a user pod's PV dynamically by editing the PVC YAML.

Below is the od.config which I used

desktop.homedirectorytype: 'persistentVolumeClaim'
desktop.persistentvolumespec: None
desktop.persistentvolumeclaimspec: {
  'storageClassName': 'csi-sc-cinder',
  'resources': {
    'requests': { 'storage': '2Gi' },
    'limits': { 'storage': '2Gi' }
  },
  'accessModes': [ 'ReadWriteMany' ]
}

Below is the storage class yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-sc-cinder
provisioner: cinder.csi.openstack.org
parameters:
  availability: nova
allowVolumeExpansion: true
volumeBindingMode: Immediate
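Since the class sets allowVolumeExpansion: true, growing a user's volume amounts to editing the PVC's requested size and letting the CSI driver resize it. The names below are illustrative; note that, to my knowledge, Kubernetes only supports growing a PVC, not shrinking it:

```yaml
# Hypothetical PVC edit: bump spec.resources.requests.storage and re-apply.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: user-home-example        # the PVC bound to the user's pod (hypothetical)
  namespace: abcdesktop
spec:
  storageClassName: csi-sc-cinder
  accessModes: [ ReadWriteMany ]
  resources:
    requests:
      storage: 4Gi               # was 2Gi; only increases are allowed
```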

Below is a screenshot from inside a user pod, running df -h:

image

Now each user can only use the storage provided in od.config and that is visible inside the user pod as well.

Now I am facing an issue -

When I try to log in a new user, after entering the credentials I need to refresh the login page twice; only then does the user get logged in. After entering the credentials, the login page gets stuck at PV creation instead of entering the VDI, with the message below:

image

Now, after I refresh the page, the user gets logged in; sometimes it gets disconnected, and after refreshing the page again, it gets logged in.

I haven't faced this issue before; earlier, after pressing sign in, the user logged in directly and I didn't need to refresh the page.

Please guide me in solving this issue as soon as possible.

Thanks

alexandredevely commented 1 year ago

Hello chintus,

This is a smart point. You've done a good job with the cinder driver, so nice. Thank you for this feedback and for the od.config and storageclass.yaml files; they look good.

When you try to log in a new user, after entering the credentials you need to refresh the login page twice; only then does the user get logged in.

We need to track the create pod process.

Can you try to create a pod without abcdesktop but with the same cinder driver? The goal is to get an idea of the driver's overhead.

Can you get the results of

kubectl logs $YOUR_POD -n abcdesktop

When you're using abcdesktop, I suppose (but we need to confirm) that the cinder driver returns OK, but the user's pod can't use the volume yet and has to wait for it to attach.

This issue should come from cinder, or from the method def createdesktop in class ODOrchestratorKubernetes(ODOrchestrator), file oc/od.orchestrator.py.

Could you read the pyos logs? This should be the main step:

kubectl get pods -l  name=pyos-od  -n abcdesktop

Then find, line by line in the debug log file, the step where pyos declares the pod as ready (or not?). This is really important to troubleshoot this case.

After entering the credentials, the login page gets stuck at PV creation instead of entering the VDI. After refreshing the page, the user gets logged in; sometimes it gets disconnected, and after refreshing the page again, it gets logged in.

If the user is disconnected after the login, it means that x11vnc didn't start.

Could you get the log files in /var/log/desktop:

fry:/var/log/desktop$ ls -la
total 144
drwxrwxrwt 2 root root   260 Aug 21 16:44 .
drwxr-xr-x 1 root root  4096 Aug  1 10:26 ..
-rw------- 1 fry  fry  95112 Aug 21 16:44 broadcast-service-stdout---supervisor-qzfmjwgm.log
-rw-r----- 1 fry  fry    218 Aug 21 16:44 config.log
-rw-r--r-- 1 fry  fry   5383 Aug 21 16:44 docker-entrypoint-pulseaudio.log
-rw-r----- 1 fry  fry    105 Aug 21 16:44 openbox_autostart.log
-rw------- 1 fry  fry     85 Aug 21 16:44 openbox-stdout---supervisor-ayt6jevf.log
-rw------- 1 fry  fry   4952 Aug 21 16:44 spawner-service-stdout---supervisor-frijql6l.log
-rw-r----- 1 fry  fry   2104 Aug 21 16:44 supervisord.log
-rw-r----- 1 fry  fry    477 Aug 21 16:44 websockify.log
-rw-r----- 1 fry  fry   1144 Aug 21 16:44 xserver.log
-rw------- 1 fry  fry      0 Aug 21 16:44 xsettingsd-stdout---supervisor-4er_stj4.log
-rw------- 1 fry  fry    356 Aug 21 16:44 xterm.js-stdout---supervisor-hj_s7sea.log

Look at both files, websockify.log and xserver.log; you should read:

fry:/var/log/desktop$ more xserver.log 
X11LISTEN=tcp
Send clipboard changes to clients
Accept clipboard updates from clients
XVNC_PARAMS=-listen tcp -interface 10.244.0.40
CONTAINER_IP_ADDR=10.244.0.40

Xvnc TigerVNC 1.13.1 - built Mar  4 2023 00:19:38
Copyright (C) 1999-2022 TigerVNC Team and many others (see README.rst)
See https://www.tigervnc.org for information on TigerVNC.
Underlying X server release 12101003

Mon Aug 21 16:44:12 2023
 vncext:      VNC extension running!
 vncext:      Listening for VNC connections on /tmp/.x11vnc (mode 0600)
 vncext:      created VNC server for screen 0
[mi] mieq: warning: overriding existing handler (nil) with 0x563f5a118fc0 for event 2
[mi] mieq: warning: overriding existing handler (nil) with 0x563f5a118fc0 for event 3

Mon Aug 21 16:44:19 2023
 Connections: accepted: /tmp/.x11vnc
 SConnection: Client needs protocol version 3.8
 SConnection: Client requests security type VncAuth(2)
 VNCSConnST:  Server default pixel format depth 24 (32bpp) little-endian rgb888
 VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian bgr888
 ComparingUpdateTracker: 0 pixels in / 0 pixels out
 ComparingUpdateTracker: (1:-nan ratio)

websockify.log

fry:/var/log/desktop$ more websockify.log 
waiting for socket /tmp/.x11vnc
.DISABLE_REMOTEIP_FILTERING is disabled, listening 10.244.0.40:6081
WEBSOCKIFY_HEARTBEAT=30
HEARTBEAT_OPTION=--heartbeat=30
WebSocket server settings:
  - Listen on 10.244.0.40:6081
  - No SSL/TLS support (no cert file)
  - proxying from 10.244.0.40:6081 to /tmp/.x11vnc
10.244.0.11 - - [21/Aug/2023 16:44:19] 10.244.0.11: Plain non-SSL (ws://) WebSocket connection
10.244.0.11 - - [21/Aug/2023 16:44:19] connecting to unix socket: /tmp/.x11vnc

I haven't faced this issue before; earlier, after pressing sign in, the user logged in directly and I didn't need to refresh the page.

I have faced this issue before when the storage class or driver was wrong: the user's home dir takes a long time to get ready and mounted. Does your cinder driver support POSIX commands like chmod and chown? If not, you may have to change the init command of the pod in od.config.

See you soon, your design is becoming technically very interesting.

Alexandre

chintus777 commented 1 year ago

Hello Alexandre, thanks a lot for your quick response.

I analyzed the points provided by you.

After reading the pyos logs, I could see some errors related to the orchestrator file while creating the desktop.

image

After refreshing the page, the following events occurred and then the user logged in:

image

Also, below are the logs of the user pod:

Defaulted container "x-planet-ess" out of: x-planet-ess, c-planet-ess, s-planet-ess, f-planet-ess, o-planet-ess, i-planet-ess (init)
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
NAMESPACE=vdi
X11LISTEN=tcp
Container local ip addr is 192.168.79.166
uid=4096(balloon) gid=4096(balloon) groups=4096(balloon),27(sudo)
vnc password use kubernetes secret
remove ~/.Xauthority file
create ~/.Xauthority file
create ~/.store directory
create ~/Desktop directory
create ~/.config directory
create ~/.config/pulse/cookie
Error setting cipher RC4
40E79B8F927F0000:error:0308010C:digital envelope routines:inner_evp_generic_fetch:unsupported:../crypto/evp/evp_fetch.c:349:Global default library context, Algorithm (RC4 : 37), Properties ()
create ~/.config/autostart directory
create ~/.config/nautilus directory
create ~/.gtkrc-2.0 file
/home/balloon/.wallpapers does not exist
create /home/balloon/.wallpapers
copy new wallpaper files in /home/balloon/.wallpapers
run xdg-user-dirs-update
DEFAULT_WALLPAPER=image.png
/home/balloon/.wallpapers/image.png file exists
Define wallpaper as image.png to /home/balloon/.config/current_wallpaper
SET_DEFAULT_COLOR is not defined, keep default value
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/broadcast-service.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/novnc.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/openbox.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/spawner-service.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/tigervnc.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/xsettingsd.conf" during parsing
2023-08-22 10:19:58,362 INFO Included extra file "/etc/supervisor/conf.d/xterm.conf" during parsing
2023-08-22 10:19:58,365 INFO RPC interface 'supervisor' initialized
2023-08-22 10:19:58,366 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-08-22 10:19:58,366 INFO supervisord started with pid 50
2023-08-22 10:19:59,368 INFO spawned: 'xserver' with pid 100
2023-08-22 10:19:59,370 INFO spawned: 'spawner-service' with pid 101
2023-08-22 10:19:59,371 INFO spawned: 'broadcast-service' with pid 102
2023-08-22 10:19:59,373 INFO spawned: 'novnc' with pid 103
2023-08-22 10:19:59,374 INFO spawned: 'openbox' with pid 104
2023-08-22 10:19:59,376 INFO spawned: 'xterm.js' with pid 106
2023-08-22 10:20:00,396 INFO success: spawner-service entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-08-22 10:20:00,396 INFO success: broadcast-service entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-08-22 10:20:00,396 INFO success: xterm.js entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-08-22 10:20:01,417 INFO success: xserver entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-08-22 10:20:01,418 INFO success: novnc entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-08-22 10:20:01,418 INFO success: openbox entered RUNNING state, process has stayed up for > than 2 seconds (startsecs

Also, I am creating my users with my OpenLDAP server; I only use cn, sn, username, password, and class when creating a new user in LDAP.

Also, the desktop logs (websockify and xserver) are the same as the ones you provided.

Can you please suggest what changes I should make so that the user can log in on the first attempt, or is there a way for the user to wait longer after clicking login but get logged in in a single attempt? Also, am I missing a parameter related to volume timeout in the .py files?

alexandredevely commented 1 year ago

Hello chintus,

Thank you for your response.

The log files for your container x-planet-ess look good to me. This issue comes from the createdesktop call in the orchestrator file; this is a good point, you've got it, I hope. To fix it, I'm trying to create the same infra with an OpenStack Cinder volume, as you have installed and configured.

Are you using a full OpenStack infra (I mean with Keystone and other services), or can a single cinder-volume service with auth_strategy = noauth work?

# cat /etc/cinder/cinder.conf
[DEFAULT]
rootwrap_config = /etc/cinder/rootwrap.conf
api_paste_confg = /etc/cinder/api-paste.ini
iscsi_helper = lioadm
volume_name_template = volume-%s
volume_group = cinder-volumes
verbose = True
auth_strategy = noauth
state_path = /var/lib/cinder
lock_path = /var/lock/cinder
volumes_dir = /var/lib/cinder/volumes
enabled_backends = lvm

[database]
connection = sqlite:////var/lib/cinder/cinder.sqlite

Do you know if csi-cinder-controllerplugin supports auth_strategy = noauth? What kind of information do you provide in your cloud.yaml file?

Thank you so much

See you Alexandre

alexandredevely commented 1 year ago

Hello chintus,

Thank you again for your comment. The SuccessfulAttachVolume reason was treated as an error and not a success. The image abcdesktopio/oc.pyos:3.0 contains a fix to continue the read-event process.

The digest for abcdesktopio/oc.pyos:3.0 is

Digest: sha256:d4554338c569a4d20444bbf4b41ca2e7e95f041b557de95ebf79569887ea0b0a

Please let me know if this patch fixes the issue. I can only confirm that the first step of event reading is correct; for the second one, we need your feedback.

See you,

Alexandre

alexandredevely commented 1 year ago

Hello chintus,

I've run some tests with an NFS server to set a quota value.

Run a dd command

Screenshot 2023-08-25 at 17 26 46

Run df command

Screenshot 2023-08-25 at 17 27 16

Steps:

The page https://www.abcdesktop.io/3.0/config/volumes describes the configuration guide using csi-driver-nfs, an LDAP server with the PosixAccount objectclass, and an NFS server with quota.
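For the record, the server-side quota that this kind of setup relies on can be sketched as follows, assuming an XFS-backed export with project quotas; the project id, name, and path are illustrative:

```
# /etc/projects  - map a quota project id to the per-user export directory
42:/nfs-data-drive/home-user42

# /etc/projid   - give the project a readable name
user42:42

# Then, on the NFS server (as root):
#   xfs_quota -x -c 'project -s user42' /nfs-data-drive
#   xfs_quota -x -c 'limit -p bhard=10g user42' /nfs-data-drive
```

With the hard block limit in place, writes past 10g fail on the server side no matter what the client's df shows.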

I hope this can help you to set your quota

Alexandre

chintus777 commented 1 year ago

Hello Alexandre ,

Thanks a lot for your valuable suggestions. I am currently using Cinder with a full OpenStack infra, with auth credentials as well. After updating the pyos image it is working fine: the issue is resolved, I have achieved the storage restriction, and I can increase or decrease the allocated space per user dynamically as well.

For the NFS driver, I will test and get back to you soon.

Thanks Again Chintu

alexandredevely commented 1 year ago

Hi Chintu,

I appreciate your feedback and I’m glad to hear that this issue is solved. Thank you for this message and for using abcdesktop.

Alexandre