microsoft / vscode-remote-release

Visual Studio Code Remote Development: Open any folder in WSL, in a Docker container, or on a remote machine using SSH and take advantage of VS Code's full feature set.
https://aka.ms/vscode-remote

code tunnel doesn't seem to start VSCode server #8289

Closed zyzhang1992 closed 1 year ago

zyzhang1992 commented 1 year ago

Greetings,

I am running code tunnel on one of our clusters.

[zyzhang@sh02-ln02 login ~]$ ./code --version
code-cli 1.75.1 (commit 441438abd1ac652551dbe4d408dfcec8a499b8bf)

When I run code tunnel, I can authenticate with GitHub, but no VS Code server appears to start, and after a while it times out. When I follow the link https://global.rel.tunnels.api.visualstudio.com/, I get this error:

404 Not Found nginx

[zyzhang@sh02-ln02 login ~]$ ./code tunnel --verbose *

The same procedure worked on two of the other clusters I have tested and it worked just fine on those two clusters. Where should I look for hints about the possible issues?

Thanks!

@bamurtaugh

bamurtaugh commented 1 year ago

The same procedure worked on two of the other clusters I have tested and it worked just fine on those two clusters.

Thanks for filing. Can you share the differences between the cluster it isn't working on and those it is working on? E.g., are they different architectures?

zyzhang1992 commented 1 year ago

They are quite similar in that they all run the same CentOS. Two clusters are at Stanford with the same 2FA; one works and the other doesn't. The third one works, and it doesn't use 2FA. I just noticed that the one that doesn't work is Intel. I'll do some more tests later.

Here is the one that doesn't work,

@.*** login ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping: 1
CPU MHz: 1260.845
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4200.27
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_pt ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d

Here are the ones that worked:

@.*** ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 8
Vendor ID: AuthenticAMD
CPU family: 23
Model: 1
Model name: AMD EPYC 7301 16-Core Processor
Stepping: 2
CPU MHz: 2200.000
CPU max MHz: 2200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4399.39
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-3,32-35
NUMA node1 CPU(s): 4-7,36-39
NUMA node2 CPU(s): 8-11,40-43
NUMA node3 CPU(s): 12-15,44-47
NUMA node4 CPU(s): 16-19,48-51
NUMA node5 CPU(s): 20-23,52-55
NUMA node6 CPU(s): 24-27,56-59
NUMA node7 CPU(s): 28-31,60-63
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb hw_pstate ssbd rsb_ctxsw ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

@.*** ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 256
On-line CPU(s) list: 0-255
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 2
NUMA node(s): 2
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7742 64-Core Processor
Stepping: 0
CPU MHz: 3269.104
BogoMIPS: 4491.48
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-63,128-191
NUMA node1 CPU(s): 64-127,192-255
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca



zyzhang1992 commented 1 year ago

I did another test on an AMD node on the cluster that I am having problems with, and it didn't work either; same situation.

[zyzhang@sh03-ln01 login ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 8
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7502 32-Core Processor
Stepping: 0
CPU MHz: 2495.394
BogoMIPS: 4990.78
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-7,64-71
NUMA node1 CPU(s): 8-15,72-79
NUMA node2 CPU(s): 16-23,80-87
NUMA node3 CPU(s): 24-31,88-95
NUMA node4 CPU(s): 32-39,96-103
NUMA node5 CPU(s): 40-47,104-111
NUMA node6 CPU(s): 48-55,112-119
NUMA node7 CPU(s): 56-63,120-127
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb cat_l3 cdp_l3 hw_pstate sme retpoline_amd ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip overflow_recov succor smca

zyzhang1992 commented 1 year ago

I ran strace on it and noticed the following:

brk(0x223a000) = 0x223a000
open("/home/users/zyzhang/.vscode-cli/code_tunnel.json", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
getuid() = 35637
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 9
connect(9, {sa_family=AF_UNIX, sun_path="/run/user/35637/bus"}, 22) = -1 ENOENT (No such file or directory)
close(9) = 0
open("/home/users/zyzhang/.vscode-cli/token.json", O_RDONLY|O_CLOEXEC) = 9
fcntl(9, F_SETFD, FD_CLOEXEC) = 0
fstat(9, {st_mode=S_IFREG|0644, st_size=142, ...}) = 0
lseek(9, 0, SEEK_CUR) = 0
read(9, "\"P0aUgAZAiyWGAghQVWg/RlYmrVtnfNo"..., 142) = 142
read(9, "", 32) = 0
close(9) = 0
uname({sysname="Linux", nodename="sh03-ln01.stanford.edu", ...}) = 0
readlink("/proc/self/exe", "/home/users/zyzhang/code", 256) = 24
brk(0x223b000) = 0x223b000
brk(0x223e000) = 0x223e000
futex(0x7f5d97828858, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f5d97f321b8, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
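When scanning a longer trace like this, it can help to pull out only the calls that returned an error. The following is a minimal sketch, not part of the original report: it assumes the trace has one syscall per line (as strace normally prints it), and the name `failed_syscalls` is made up for illustration. Note that a failed call is not necessarily the cause of the problem; it only narrows down where to look.

```python
import re

# Matches strace lines of the form:  name(args) = -1 ERRNO (description)
FAILED_RE = re.compile(
    r"^(?P<call>\w+)\((?P<args>.*)\)\s+=\s+-1\s+(?P<errno>E\w+)\s+\((?P<desc>.*)\)$"
)


def failed_syscalls(trace_text):
    """Return (syscall, errno name, raw arguments) for every call that returned -1."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            failures.append((match["call"], match["errno"], match["args"]))
    return failures
```

Applied to the trace above, this would list the two ENOENT results (the missing code_tunnel.json and the missing /run/user/35637/bus socket) and skip the successful calls.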

connor4312 commented 1 year ago

From the logs you posted, it looks like there is some kind of network issue on the machine you're using

error sending request for url (https://global.rel.tunnels.api.visualstudio.com/api/v1/tunnels?global=true&tags=vscode-server-launcher&allTags=true): error trying to connect: tcp connect error: Operation timed out (os error 110)

or possibly DNS is not resolving that hostname correctly

zyzhang1992 commented 1 year ago

Thanks @connor4312 !

It looks like there is a problem starting the vscode-server, hence that connection error? Does that connection refer to the connection to the server running on the cluster?

In fact, I don't see any signs of starting any other processes when running code tunnel.

Could that be an issue with communications between github and that cluster?

Is there a way not to use the github authentication when starting/connecting with the tunnel?

connor4312 commented 1 year ago

The VS Code server is not started until a remote editor connects to the tunnel, because before that point we don't know what server version is needed. https://global.rel.tunnels.api.visualstudio.com is the host that serves tunnel access; code tunnel is not functional if it's unavailable.

zyzhang1992 commented 1 year ago

Here is the output when it starts correctly. As can be seen, there is also a request to start a new connection to https://usw3.rel.tunnels.api.visualstudio.com/, but when I click on that one, I also get the 404 Not Found nginx error.

When you mention "remote editor", did you mean the editor/vscode installed on my local machine? In that case, could there be configurations with my local vscode? I was assuming my local vscode should be fine since it worked with the other two clusters.

To resolve the host at https://global.rel.tunnels.api.visualstudio.com, is it the cluster on which I am running code tunnel that will try to connect to it, possibly through the DNS service? With that assumption, what should I look for to troubleshoot that?

Apologies for the ignorant questions. I may need to have a better understanding of the possible processes involved to be able to ask the right questions of people at our institution.

Open this link in your browser https://vscode.dev/tunnel/scg

[2023-03-28 15:08:54] debug [tunnels::connections::ws] sent liveness ping
[2023-03-28 15:08:54] debug [tunnels::connections::ws] received liveness pong
[2023-03-28 15:09:54] debug [tunnels::connections::ws] received liveness pong
[2023-03-28 15:10:54] debug [tunnels::connections::ws] sent liveness ping
[2023-03-28 15:10:54] debug [tunnels::connections::ws] received liveness pong

connor4312 commented 1 year ago

In that output, it looks like the tunnel was started up successfully.

When you mention "remote editor", did you mean the editor/vscode installed on my local machine? In that case, could there be configurations with my local vscode? I was assuming my local vscode should be fine since it worked with the other two clusters.

The VS Code server version must match the version of the client on the other end, so the VS Code server isn't downloaded until someone connects to the tunnel and tells the code tunnel process what version it needs.
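The on-demand behavior described above can be pictured with a toy model. This is only an illustrative sketch and not how the code CLI is actually implemented: the class `TunnelHost`, the method `on_client_connect`, and the `fetch_server` callback are all hypothetical names standing in for "download and start the server build that matches the client".

```python
class TunnelHost:
    """Toy model of a tunnel process that starts a server only on demand."""

    def __init__(self, fetch_server):
        # fetch_server(version) is a hypothetical stand-in for downloading
        # and launching the server build that matches a client version.
        self._fetch_server = fetch_server
        self._servers = {}

    def on_client_connect(self, client_version):
        # Nothing is started until a client reports the version it needs.
        if client_version not in self._servers:
            self._servers[client_version] = self._fetch_server(client_version)
        return self._servers[client_version]
```

In this model, seeing no server process right after the tunnel starts is expected; a server only appears once the first editor connects.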

zyzhang1992 commented 1 year ago

Thanks Connor. Yes in that case it was working fine. I included that as a comparison to the case which failed.

The difference is what happens after the following,

[2023-03-28 15:07:53] trace Found token in keyring
[2023-03-28 15:07:53] debug [reqwest::connect] starting new connection: https://api.github.com/
[2023-03-28 15:07:53] debug [reqwest::connect] starting new connection: https://usw3.rel.tunnels.api.visualstudio.com/

In the failing case, it just gets stuck here. As you pointed out, there may be DNS resolution issues. If I understand correctly, the cluster I am on sends a connection request to https://usw3.rel.tunnels.api.visualstudio.com/ but cannot connect to it. How do I troubleshoot this?

connor4312 commented 1 year ago

You aren't expected to open that URL directly. In the logs you posted, it showed the link you use to connect:

Open this link in your browser https://vscode.dev/tunnel/scg

What happens when you run the tunnel and go to that URL?

zyzhang1992 commented 1 year ago

When I open that link, I land on the VS Code web interface for the code tunnel created on the scg machine.

This is the case where it worked. On the other cluster, where it doesn't work, the last thing I can see is

[2023-03-28 15:07:53] debug [reqwest::connect] starting new connection: https://usw3.rel.tunnels.api.visualstudio.com/

and it doesn't proceed any further from here until it fails explicitly.
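To compare a working and a failing run, one option is to split the verbose log into fields and find the last connection the CLI attempted before it stalled. The sketch below assumes the line layout seen in the excerpts in this thread (`[timestamp] level [target] message`, with the target optional) and one entry per line; the function names `parse_line` and `last_connection_attempt` are made up for illustration.

```python
import re
from datetime import datetime

# Layout assumed from the log excerpts: [timestamp] level [optional target] message
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+"
    r"(?P<level>\w+)\s+"
    r"(?:\[(?P<target>[^\]]+)\]\s+)?"
    r"(?P<msg>.*)"
)
CONNECT_PREFIX = "starting new connection: "


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "time": datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S"),
        "level": match["level"],
        "target": match["target"],  # None when the line has no [target] part
        "message": match["msg"],
    }


def last_connection_attempt(log_text):
    """Return (time, url) of the last 'starting new connection' entry, if any."""
    last = None
    for line in log_text.splitlines():
        entry = parse_line(line)
        if entry and entry["message"].startswith(CONNECT_PREFIX):
            last = (entry["time"], entry["message"][len(CONNECT_PREFIX):])
    return last
```

If the last attempted URL is also the last line of the failing log, the host in that URL is the one to test from the affected machine.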

zyzhang1992 commented 1 year ago

From the logs you posted, it looks like there is some kind of network issue on the machine you're using

error sending request for url (https://global.rel.tunnels.api.visualstudio.com/api/v1/tunnels?global=true&tags=vscode-server-launcher&allTags=true): error trying to connect: tcp connect error: Operation timed out (os error 110)

or possibly DNS is not resolving that hostname correctly

@connor4312 Is this what we should focus on to debug it? Is there any way to test the connection or the DNS in this situation?

connor4312 commented 1 year ago

Yeah, I would start by seeing if you can curl that URL from the affected machine. If it's accessible, it'll give you a 401 for lacking auth, but the CLI is failing before it gets to that step.
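Besides curl, the two suspected failure modes (DNS not resolving versus the TCP connection timing out) can be told apart with a few lines of standard-library Python. This is a rough sketch under stated assumptions: the function name `probe`, its return strings, and the 5-second timeout are all arbitrary choices for illustration, and it only checks reachability, not authentication or HTTP behavior.

```python
import socket


def probe(host, port=443, timeout=5.0):
    """Return 'dns_failed', 'tcp_failed', or 'ok' for the given host and port."""
    try:
        # Step 1: name resolution. A failure here points at DNS.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failed"

    # Step 2: try each resolved address. A timeout or refusal on all of them
    # points at routing or firewall rules rather than DNS.
    for family, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(addr)
            return "ok"
        except OSError:
            continue
    return "tcp_failed"
```

Running `probe("global.rel.tunnels.api.visualstudio.com")` on both a working and a failing cluster would show whether the name resolves on each and whether port 443 is reachable. A result of `tcp_failed` would be consistent with the "Operation timed out (os error 110)" message quoted earlier in the thread.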

zyzhang1992 commented 1 year ago

Thanks Connor.

I got the following

curl -isv 'https://global.rel.tunnels.api.visualstudio.com/api/v1/tunnels?global=true&tags=vscode-server-launcher&allTags=true'

It keeps hanging until the tunnel times out.

zyzhang1992 commented 1 year ago

What other tests can I do at this point?

zyzhang1992 commented 1 year ago

Here is what I have on the cluster where it failed:

[zyzhang@sh02-ln02 login ~/vscode-test]$ curl -isv 'https://global.rel.tunnels.api.visualstudio.com/api/v1/tunnels?global=true&tags=vscode-server-launcher&allTags=true'

On the cluster where it worked, the 1st few lines are:

curl -isv 'https://global.rel.tunnels.api.visualstudio.com/api/v1/tunnels?global=true&tags=vscode-server-launcher&allTags=true'

connor4312 commented 1 year ago

I'm not a networking expert, and definitely not a networking expert for the environment you're running in. I would probably start by checking any firewall rules on the machine, or any policies that might be applied to your network, e.g. in your cloud provider's console, if you use one.

It looks like this is not an issue on the VS Code side of things, so I will close this issue.
