PKUHPC / OpenSCOW

Super Computing On Web
https://www.pkuscow.com/
Mulan Permissive Software License, Version 2

[Help] portal-web: User fails to create a new desktop connection #1376

Closed zhengkang2020 closed 1 month ago

zhengkang2020 commented 4 months ago

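What happened

Other users can use the desktop feature normally; only user A fails to create a new desktop. Occasionally the message "Unable to connect to the login node as the user. Please confirm that your home directory's permissions are 700, 750, or 755" appears, but a check shows the permissions are the same as for other users.

portal-web log: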

Error: connect ECONNREFUSED 192.168.55.83:5901
scow-portal-web-1  |     at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1606:16) {
scow-portal-web-1  |   errno: -111,
scow-portal-web-1  |   code: 'ECONNREFUSED',
scow-portal-web-1  |   syscall: 'connect',
scow-portal-web-1  |   address: '192.168.55.83',
scow-portal-web-1  |   port: 5901
scow-portal-web-1  | } Error when proxing WS requests
scow-portal-web-1  | Error: connect ECONNREFUSED 192.168.55.83:5901
scow-portal-web-1  |     at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1606:16) {
scow-portal-web-1  |   errno: -111,
scow-portal-web-1  |   code: 'ECONNREFUSED',
scow-portal-web-1  |   syscall: 'connect',
scow-portal-web-1  |   address: '192.168.55.83',
scow-portal-web-1  |   port: 5901
scow-portal-web-1  | } Error when proxing WS requests
scow-portal-web-1  | Error: read ECONNRESET
scow-portal-web-1  |     at TCP.onStreamRead (node:internal/stream_base_commons:218:20) {
scow-portal-web-1  |   errno: -104,
scow-portal-web-1  |   code: 'ECONNRESET',
scow-portal-web-1  |   syscall: 'read'
scow-portal-web-1  | } Error when proxing WS requests

Login node messages log:

Jul 25 16:38:04 hpc-login dbus-daemon[776381]: [session uid=300024 pid=776376] Activating service name='org.freedesktop.systemd1' requested by ':1.0' (uid=300024 pid=776421 comm="systemctl --user import-environment DISPLAY XAUTHO" label="kernel")
Jul 25 16:38:04 hpc-login dbus-daemon[776381]: [session uid=300024 pid=776376] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1

VNC log:

TurboVNC Server (Xvnc) 64-bit v3.1.1 (build 20240127)
Copyright (C) 1999-2024 The VirtualGL Project and many others (see README.md)
Visit http://www.TurboVNC.org for more information on TurboVNC

25/07/2024 16:41:22 Using security configuration file /etc/turbovncserver-security.conf
25/07/2024 16:41:22 Enabled security type 'otp'
25/07/2024 16:41:22 Desktop name 'desktop-20240725-164115' (hpc-login:1)
25/07/2024 16:41:22 Protocol versions supported: 3.3, 3.7, 3.8, 3.7t, 3.8t
25/07/2024 16:41:22 Listening for VNC connections on TCP port 5901
25/07/2024 16:41:22   Interface 0.0.0.0
25/07/2024 16:41:22 Framebuffer: BGRX 8/8/8/8
25/07/2024 16:41:22 New desktop size: 1240 x 900
25/07/2024 16:41:22 New screen layout:
25/07/2024 16:41:22   0x00000040 (output 0x00000040): 1240x900+0+0
25/07/2024 16:41:22 Maximum clipboard transfer size: 1048576 bytes
25/07/2024 16:41:22 VNC extension running!
xstartup.turbovnc: Creating new session bus instance:
xstartup.turbovnc:   unix:abstract=/tmp/dbus-RcwzjVXQk5,guid=a854043238fd32e3176a9bc566a20fb3
xstartup.turbovnc: Using 'xfce' window manager in
xstartup.turbovnc:   /usr/share/xsessions/xfce.desktop
xstartup.turbovnc: Executing /etc/X11/xinit/Xsession "startxfce4"
Environment variable $XAUTHORITY not set, ignoring.
Failed to import environment: Process org.freedesktop.systemd1 exited with status 1
/bin/startxfce4: X server already running on display :11.0
xrdb: Connection refused
xrdb: Can't open display ':11.0'
xfce4-session: Cannot open display: .
Type 'xfce4-session --help' for usage.
Killing Xvnc process ID 777453
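
In the log above, Xvnc announces display :1 (hpc-login:1), while startxfce4 and xrdb refer to display :11.0. A minimal sketch for spotting this kind of mismatch in a TurboVNC log follows. The function names are hypothetical and are not part of SCOW or TurboVNC, and the patterns assume the log wording shown above.

```shell
# Hypothetical helpers; they only parse a log file with the wording shown above.

# Print the display number Xvnc announced, e.g. "1" for "(hpc-login:1)".
vnc_server_display() {
  sed -n "s/.*Desktop name '.*' (.*:\([0-9][0-9]*\))\$/\1/p" "$1" | head -n 1
}

# Print the display number an X client failed to open, e.g. "11" for ':11.0'.
vnc_client_display() {
  sed -n "s/.*Can't open display ':\([0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

# Return non-zero when both numbers are present in the log and differ.
check_vnc_log() (
  server=$(vnc_server_display "$1")
  client=$(vnc_client_display "$1")
  if [ -n "$server" ] && [ -n "$client" ] && [ "$server" != "$client" ]; then
    echo "mismatch: Xvnc listens on :$server but the session tried :$client"
    exit 1
  fi
  echo "no display mismatch found"
)
```

Calling check_vnc_log with the path of the user's VNC log file would report a mismatch for the log above; a mismatch suggests that something in the session startup is overriding DISPLAY.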

Environment

- OS: Rocky Linux 9.4
- Scheduler: 23.02.6
- Docker: 24.0.7
- Docker-compose: 2.23.3
- SCOW cli: 1.6.1
- SCOW: 1.6.1
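- Adapter: self-compiled build

Miracle575 commented 1 month ago

You could try deleting the file scow/desktops/desktops.json under the user's home directory and see whether the desktop can then be created normally. It would also help to provide the portal-server log for troubleshooting.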

zhengkang2020 commented 1 month ago

The problem persists. Here are the logs:

portal-server log:

scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:08.448Z","pid":18,"hostname":"b219a7f3ae3a","msg":"Checking activation status of clusters with ids ([\"hpc01\"]) "}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:09.421Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Command execCommand sudo  -u jiaweitang -s /opt/TurboVNC/bin/vncserver -list, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:11.372Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Command execCommand sudo  -u jiaweitang -s /opt/TurboVNC/bin/vncserver -securitytypes OTP -otp -wm 2d -name desktop-20241014-172806, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:11.883Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Command execCommand eval echo ~jiaweitang, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:12.505Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Command execCommand sudo  -u jiaweitang -s mkdir -p /data/home/jiaweitang/scow/desktops, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.534Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Desktop desktop-20241014-172806 added"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.534Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrl","path":"/scow.portal.DesktopService/CreateDesktop","msg":"Request completed."}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.699Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrm","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Starting request"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.735Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrm","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Request completed."}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.748Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrn","path":"/scow.portal.DesktopService/ListUserDesktops","msg":"Starting request"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:13.758Z","pid":18,"hostname":"b219a7f3ae3a","msg":"Checking activation status of clusters with ids ([\"hpc01\"]) "}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:14.728Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrn","path":"/scow.portal.DesktopService/ListUserDesktops","msg":"Command execCommand sudo  -u jiaweitang -s /opt/TurboVNC/bin/vncserver -list, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:15.293Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrn","path":"/scow.portal.DesktopService/ListUserDesktops","msg":"Command execCommand eval echo ~jiaweitang, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:15.857Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vro","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Starting request"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:15.888Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vro","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Request completed."}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:15.932Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrn","path":"/scow.portal.DesktopService/ListUserDesktops","msg":"Command execCommand sudo  -u jiaweitang -s mkdir -p /data/home/jiaweitang/scow/desktops, options %o"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:16.409Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrn","path":"/scow.portal.DesktopService/ListUserDesktops","msg":"Request completed."}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:21.806Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrp","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Starting request"}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:21.846Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrp","path":"/scow.common.ConfigService/GetClusterConfigFiles","msg":"Request completed."}
scow-portal-server-1  | {"level":30,"time":"2024-10-14T09:28:26.371Z","pid":18,"hostname":"b219a7f3ae3a","req":"1vrq","path":"/scow.portal.AppService/ListAppSessions","msg":"Starting request"}

scow/desktops/desktops.json as regenerated after being cleared:

[{"host":"hpc-login","displayId":1,"desktopName":"desktop-20241014-172806","wm":"2d","createTime":"2024-10-14T09:28:11.374Z"}]

Miracle575 commented 1 month ago

Can the SCOW node reach 192.168.55.83:5901?

zhengkang2020 commented 1 month ago

This is currently an intranet test environment with the firewall turned off, and connecting to 192.168.55.83:5901 works normally:

[root@hpc-scow ~]# nc -zv 192.168.55.83 5901
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.55.83:5901.
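Ncat: 0 bytes sent, 0 bytes received in 0.05 seconds.

tongchong commented 1 month ago

If only a single user is affected, please check that user's home directory to see whether they have installed other software or modified any environment variables; the problem may be related to that.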
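
One way to carry out such a check is to scan the user's shell startup files for lines that set DISPLAY. The sketch below is only an illustration: the function name and the list of files are assumptions, and it catches plain assignments only, not values set indirectly by scripts that these files source.

```shell
# Hypothetical helper: list uncommented DISPLAY assignments in a home directory's
# bash startup files. Output format: <file>:<line number>:<line>.
find_display_overrides() (
  home_dir=$1
  for f in .bashrc .bash_profile .bash_login .profile; do
    path="$home_dir/$f"
    [ -f "$path" ] || continue
    # Matches e.g. "export DISPLAY=:11.0" or "DISPLAY=:0"; skips commented lines.
    grep -nE '^[[:space:]]*(export[[:space:]]+)?DISPLAY=' "$path" | sed "s|^|$f:|"
  done
)
```

For example, running find_display_overrides on the affected user's home directory prints nothing when no startup file sets DISPLAY.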

zhengkang2020 commented 1 month ago

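The problem is solved. It was confirmed that the user's .bashrc file had an extra line at the end, export DISPLAY=:11.0. The specific steps were:

## 1. Fix the user's environment variable (performed by the user)
cd && cp .bashrc .bashrc.bak && sed -i '/^export DISPLAY=:11.0/s/^/#/' ~/.bashrc && source .bashrc
## 2. Log in to the login node as root, terminate the user's sessions, then have the user test again
pkill -u xxxx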
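
The first step can be generalized to any DISPLAY value. The following is a sketch rather than the exact procedure used here: the function name is hypothetical, and it comments out every uncommented DISPLAY assignment in the given file after saving a .bak copy. Shells that are already running keep the old value, which is why the user's sessions still need to be restarted afterwards.

```shell
# Hypothetical helper: back up a startup file, then comment out any line that
# assigns DISPLAY (with or without "export"), whatever the value is.
comment_out_display() (
  file=$1
  cp -p "$file" "$file.bak" || exit 1
  # Read from the backup and rewrite the original, so no in-place sed flag is needed.
  sed -E 's/^([[:space:]]*(export[[:space:]]+)?DISPLAY=)/#\1/' "$file.bak" > "$file"
)
```

For example, comment_out_display ~/.bashrc leaves the original content in ~/.bashrc.bak.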

Thanks, @tongchong