cloudyr / googleComputeEngineR

An R interface to the Google Cloud Compute API, for launching virtual machines
https://cloudyr.github.io/googleComputeEngineR/

Windows host file not updating when attempting SSH #137

Open MarkEdmondson1234 opened 5 years ago

MarkEdmondson1234 commented 5 years ago

Hi all, I've also been struggling with this issue while trying to set up a VM cluster using gce_vm_cluster and unfortunately the solutions suggested both here and in #32 have not been working.

Here is how my gce_vm_cluster is set up:

ssh <- list(
  "username" = "crayn",
  "ssh_overwrite" = FALSE,
  "key.pub" = file.path("C:","Users", "crayn", ".ssh", "id_rsa.pub"), 
  "key.private" = file.path("C:","Users", "crayn", ".ssh", "id_rsa")
)

vms <- gce_vm_cluster(
  vm_prefix = "attempt25-", 
  cluster_size = 1,
  docker_image = "rocker/r-parallel", 
  ssh_args = ssh,
  project = gce_get_global_project(), 
  zone = gce_get_global_zone()
)

I've gone through all the suggestions both here and in #32, but am getting the following error message:

2019-06-08 18:35:31> # Creating cluster with settings: template = r-base, dynamic_image = rocker/r-parallel, wait = FALSE, predefined_type = n1-standard-1
2019-06-08 18:35:34> Operation running...
2019-06-08 18:35:37> Operation running...
2019-06-08 18:35:43> Operation complete in 6 secs
2019-06-08 18:35:44> attempt24-1 VM running
2019-06-08 18:35:44> # Setting up SSH:username = crayn,ssh_overwrite = FALSE,key.pub = C:/Users/crayn/.ssh/id_rsa.pub,key.private = C:/Users/crayn/.ssh/id_rsa
2019-06-08 18:35:44> Using ssh-key files given as C:/Users/crayn/.ssh/id_rsa.pub / C:/Users/crayn/.ssh/id_rsa
2019-06-08 18:35:52> Public SSH key uploaded to instance
2019-06-08 18:35:52> # Testing cluster:
Failed to add the host to the list of known hosts ('C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts/hosts').
GetConsoleMode on STD_INPUT_HANDLE failed with 6

My session then hangs for several minutes until it times out, although it does create the VM in my GCP Console.

After doing some investigation with trace, the problem seems to isolate to gce_ssh_addkeys > do_system > system2

I tried running the same arguments through system2 verbosely:

test_cmd <- "ssh"
test_sargs <- c(
  "-v", 
  "-o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile='C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts/hosts'", " -i ", 
  "'C:\\Users\\crayn\\.ssh\\id_rsa'", "crayn@35.203.124.94", "\"echo attempt25-1 ssh working\""
)
system2(test_cmd, args = test_sargs, wait = TRUE, stdout = "", stderr = "")

This produced the following readout:

OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.5
debug1: Connecting to 35.203.124.94 [35.203.124.94] port 22.
debug1: Connection established.
debug1: identity file C:\\Users\\crayn/.ssh/id_rsa type 0
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_ed25519-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_xmss type -1
debug1: key_load_public: No such file or directory
debug1: identity file C:\\Users\\crayn/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_for_Windows_7.7
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.5
debug1: match: OpenSSH_7.5 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 35.203.124.94:22 as 'crayn'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ssh-ed25519
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ssh-ed25519 SHA256:fJX2gVMR82gN6SlA2BCwZvvwmbt7JlRQc/Xa1+LK/fU
Failed to add the host to the list of known hosts ('C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts/hosts').
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,ssh-rsa,rsa-sha2-256,rsa-sha2-512,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: Offering public key: RSA SHA256:R2VhkqCUoT7Clr2qCBH3dC6AFD/4eUJopH2q3szWWMY C:\\Users\\crayn\\.ssh\\id_rsa
debug1: Authentications that can continue: publickey
debug1: Offering public key: RSA SHA256:1GPFycnKhdNiG1HhPpxxc+fmY1oIkRSDF6Dfu7Bg5kk C:\\Users\\crayn/.ssh/id_rsa
debug1: Server accepts key: pkalg rsa-sha2-512 blen 279
debug1: Authentication succeeded (publickey).
Authenticated to 35.203.124.94 ([35.203.124.94]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug1: Sending command: echo attempt25-1 ssh working
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
GetConsoleMode on STD_INPUT_HANDLE failed with 6
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0

It seems to me that the problem has something to do with writing to the "hosts" file. Do you have any suggestion to resolve this?

Originally posted by @camraynor in https://github.com/cloudyr/googleComputeEngineR/issues/35#issuecomment-500176570

MarkEdmondson1234 commented 5 years ago

I migrated this to a new issue, to make it easier to keep track.

I'm afraid I've not come across this before, which suggests it's something unique to your setup. Perhaps you don't have admin access to the hosts file location it's trying to write to? It may also hint at a deeper issue with using system2 for SSH commands.

Nevertheless, I would debug by attempting to connect to just one VM (gce_vm()) to avoid unnecessary costs.

Have you installed gcloud and set up SSH/authentication through that? If it works with one VM then it should be set up and ready for all.

MarkEdmondson1234 commented 5 years ago

Ping @camraynor

MarkEdmondson1234 commented 5 years ago

Isn't it perhaps something to do with file paths - C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts/hosts isn't a valid location on Windows?

Should be C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts\\HOSTS ?
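One way to rule the separators in or out is to make them uniform before the path reaches ssh. The sketch below is only an illustration of that idea: to_backslashes() is a hypothetical helper, not part of googleComputeEngineR, and the example path is the one from the error message above.

```r
# Hypothetical helper (not part of googleComputeEngineR): rewrite every
# forward slash in a path as a backslash so the separators are uniform.
to_backslashes <- function(path) {
  chartr("/", "\\", path)
}

# The path from the error message mixes backslashes with one forward slash.
mixed <- "C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts/hosts"
uniform <- to_backslashes(mixed)

# print() shows the R-escaped form, so each backslash appears doubled.
print(uniform)
```

If the error persists with a uniform path, the separators are unlikely to be the cause.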

camraynor commented 5 years ago

Thank you very much for your suggestions.

For the interim, I have found a solution in running RStudio in a Linux Docker container that I've configured to install openssh-server and then use keys generated by ssh-keygen. That said, I would still like to get this working on my normal Windows setup.

To cross the easy one off the list, the formatting on the hosts file path doesn't seem to be the culprit since the error persists no matter how the path is formatted (the lingering forward slash at the end is created by assignInNamespace):

> test_sargs <- c(
+   # "-v",
+   "-o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile='C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts\\hosts'",
+   " -i ",
+   "'C:\\Users\\crayn\\.ssh\\id_rsa'",
+   "crayn@35.203.124.94",
+   "\"echo attempt25-1 ssh working\""
+ )
> system2(test_cmd, args = test_sargs, wait = TRUE, stdout = "", stderr = "")
Failed to add the host to the list of known hosts ('C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts\\hosts').
GetConsoleMode on STD_INPUT_HANDLE failed with 6

> test_sargs <- c(
+   "-o BatchMode=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile='C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts\\HOSTS'",
+   " -i ",
+   "'C:\\Users\\crayn\\.ssh\\id_rsa'",
+   "crayn@35.203.124.94",
+   "\"echo attempt25-1 ssh working\""
+ )
> system2(test_cmd, args = test_sargs, wait = TRUE, stdout = "", stderr = "")
Failed to add the host to the list of known hosts ('C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP30Ts\\HOSTS').
GetConsoleMode on STD_INPUT_HANDLE failed with 6

Additionally, permissions seem to be OK. I can create the directory and file through R using the dir.create and file.create commands respectively; however, this doesn't solve the Failed to add the host to the list of known hosts error. I also checked the file permissions using PowerShell and it doesn't look like that's the issue either (the folder has identical permissions):

PS C:\Users\crayn\AppData\Local\Temp\RtmpkP30Ts> Get-Acl hosts

    Directory: C:\Users\crayn\AppData\Local\Temp\RtmpkP30Ts

Path  Owner     Access
----  -----     ------
hosts MSI\crayn NT AUTHORITY\SYSTEM Allow  FullControl...

I tried changing the permissions to my user (adapted from this post) but that didn't fix the problem either.
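The R-side permission check described above could look roughly like the following. It is a minimal sketch that uses the session's own tempdir() rather than the exact folder from the error message, and it only reports what R can see. On Windows, file.access() may not reflect ACLs fully, so a positive result does not prove that ssh.exe can write to the same file.

```r
# Create a throwaway hosts file in this session's temporary directory.
hosts_path <- file.path(tempdir(), "hosts")
created <- file.create(hosts_path)

# file.access() returns 0 on success and -1 on failure; mode = 2 tests
# write access. unname() drops the path names it attaches to the result.
file_writable <- unname(file.access(hosts_path, mode = 2)) == 0
dir_writable <- unname(file.access(dirname(hosts_path), mode = 2)) == 0

cat("created:", created, "\n")
cat("file writable:", file_writable, "\n")
cat("directory writable:", dir_writable, "\n")
```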

I'll take another shot at this later today. I'm wondering if the issue could have something to do with how Windows is interpreting or reading the key itself, since the debug output contains debug1: Server host key: ssh-ed25519 SHA256:xw9ZSWkRt12SkGiVlwHTYFAOD/dClRSynuBMWJADU5M and it looks like the RStudio key is RSA?

MarkEdmondson1234 commented 5 years ago

SSH is a pain. Could you see if you can connect OK using this library: https://github.com/ropensci/ssh? If so, this will hasten my port over to it, to help smooth out these kinds of issues.

papageorgiou commented 5 years ago

I am getting the same error message on Windows when running the gce_shiny_addapp function. Just like @camraynor, I use the RStudio SSH keys and my session hangs after the error message. My next step is to look at the suggested ssh library / gcloud option.

camraynor commented 5 years ago

I've now tried using the ssh package to generate the SSH key; however, I still get the same error message: Failed to add the host to the list of known hosts ('C:\\Users\\crayn\\AppData\\Local\\Temp\\Rtmp67SsIh/hosts'). GetConsoleMode on STD_INPUT_HANDLE failed with 6. To me this suggests that a permissions issue, rather than an SSH issue, might be causing the problem. I'll keep investigating...

All that said, the ssh package does make it a lot easier to set up an ssh key pair than using RStudio's key pair or creating one with a tool like PuttyGen or OpenSSH.

camraynor commented 5 years ago

I should have mentioned, the ssh_keygen function doesn't write the public key to file, so I modified my session's gce_ssh_addkeys function to accept a character as key.pub, which then successfully uploaded the public ssh key to the instance before hitting the error.

camraynor commented 5 years ago

The host file issue disappears by removing the -o UserKnownHostsFile='C:\\Users\\crayn\\AppData\\Local\\Temp\\RtmpkP3fsa\\hosts' argument from the system2 args. With that change, I can see a message Found key in C:\\Users\\crayn/.ssh/known_hosts:1 when I run the system2 command verbosely. The authentication then fails with message debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0.
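To compare the two variants side by side, the argument vector can be assembled with the known-hosts option switched on or off. This is a minimal sketch: build_ssh_args() is a hypothetical helper, not the package's own code, and the host, user, and key path are the ones quoted earlier in the thread.

```r
# Hypothetical helper: assemble ssh arguments, adding the UserKnownHostsFile
# option only when a path is supplied.
build_ssh_args <- function(user, host, key_private, remote_cmd,
                           known_hosts = NULL) {
  opts <- c("-o", "BatchMode=yes", "-o", "StrictHostKeyChecking=no")
  if (!is.null(known_hosts)) {
    opts <- c(opts, "-o", paste0("UserKnownHostsFile=", shQuote(known_hosts)))
  }
  c(opts, "-i", shQuote(key_private),
    paste0(user, "@", host), shQuote(remote_cmd))
}

# Variant 1: ssh falls back to its default known_hosts file.
args_default <- build_ssh_args(
  "crayn", "35.203.124.94", "C:/Users/crayn/.ssh/id_rsa",
  "echo attempt25-1 ssh working"
)

# Variant 2: ssh is pointed at a hosts file in the R temporary directory.
args_custom <- build_ssh_args(
  "crayn", "35.203.124.94", "C:/Users/crayn/.ssh/id_rsa",
  "echo attempt25-1 ssh working",
  known_hosts = file.path(tempdir(), "hosts")
)

print(args_default)
print(args_custom)
# Not run here because it needs a reachable VM:
# system2("ssh", args = args_default, stdout = "", stderr = "")
```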

MarkEdmondson1234 commented 5 years ago

Ok so the hosts file is written in these lines:

https://github.com/cloudyr/googleComputeEngineR/blob/74272672d3a323958370c9852280f0a0720b8b8f/R/ssh_admin.R#L3-L19

Perhaps the temp file on your PC is unreadable - is there a better location on Windows to put it? Perhaps a recent Windows update changed temp file behaviour?
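If the temporary directory does turn out to be the problem, one option would be to try other directories first. The sketch below assumes a hypothetical pick_hosts_file() helper and a made-up file name, gce_known_hosts; neither exists in the package, and the candidate directories are only suggestions.

```r
# Hypothetical helper: return a hosts-file path inside the first candidate
# directory that exists and that R reports as writable.
pick_hosts_file <- function(candidates, filename = "gce_known_hosts") {
  for (d in candidates) {
    if (dir.exists(d) && unname(file.access(d, mode = 2)) == 0) {
      return(file.path(d, filename))
    }
  }
  stop("No writable directory found for the hosts file")
}

candidates <- c(
  file.path(path.expand("~"), ".ssh"),  # the user's usual ssh directory
  tempdir()                              # the location seen in the error messages
)

hosts_file <- pick_hosts_file(candidates)
print(hosts_file)
```

Whether ssh.exe accepts the chosen location would still need to be confirmed on an affected Windows machine.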

papageorgiou commented 5 years ago

In case this helps, I tried the code below on an old Windows 7 machine, with keys from RStudio. It did throw an error, but there was no issue with the hosts file.

vm <- gce_ssh_setup(vm, key.pub = "C:/Users/Alex/.ssh/id_rsa.pub", key.private = "C:/Users/Alex/.ssh/id_rsa")
2019-06-14 20:45:14> Using ssh-key files given as C:/Users/Alex/.ssh/id_rsa.pub / C:/Users/Alex/.ssh/id_rsa
2019-06-14 20:45:23> Public SSH key uploaded to instance

app_dir <- system.file("dockerfiles","shiny-googleAuthRdemo", package = "googleComputeEngineR")
gce_shiny_addapp(vm, app_image = "gceshinydemo", dockerfolder = app_dir)
2019-06-14 20:45:44> Checked connection to 130.211.98.33 : status_code 200
Warning: Permanently added '130.211.98.33' (RSA) to the list of known hosts.
/usr/bin/ssh: No such file or directory
2019-06-14 20:45:54> Error in gce_shiny_addapp(vm, app_image = "gceshinydemo", dockerfolder = app_dir) : Problem building image on instance

adamribaudo commented 4 years ago

I'm encountering this on Windows as well.

vm5 <- gce_vm(template = "rstudio", 
             name = "test-5",
             predefined_type = "f1-micro")

vm5 <- gce_ssh_setup(vm5, 
                    username = "adamr",
                    key.pub = "C:\\Users\\adamr\\.ssh\\id_rsa.pub",
                    key.private = "C:\\Users\\adamr\\.ssh\\id_rsa.pub")

gce_ssh(vm5, "echo foo")

Results in:

Failed to add the host to the list of known hosts ('C:\\Users\\adamr\\AppData\\Local\\Temp\\RtmpEHFOs4/hosts').
GetConsoleMode on STD_INPUT_HANDLE failed with 6

foo

And then a subsequent hang. If I create my own 'hosts' file under tempdir() I get a TRUE response from file.exists()

> file.exists('C:\\Users\\adamr\\AppData\\Local\\Temp\\RtmpEHFOs4/hosts')
[1] TRUE

But that doesn't mean that an external tool like ssh.exe will work with that syntax.

I've tried modifying that file's permissions to provide write access to 'Everyone' with no luck.

Clearly the "echo" command is making its way to the VM because I get the resulting "foo" back, so I don't think it's an issue with the keys.

@MarkEdmondson1234 it could be that your use of tempdir() which handles file separation with "\\" is conflicting with file.path which uses "/" as the separator on Windows. This mix of formats works in R but I can imagine that ssh.exe might not like it.

Update: I tried running ssh.exe directly with the UserKnownHostsFile option, using every permutation of file separators ( \\, \, / ) I could think of, but the "Failed to add the host to the list of known hosts" error is returned every time.

MarkEdmondson1234 commented 4 years ago

I'm not very familiar with Windows SSH setups. It looks like a difference in settings between Windows 7 and modern Windows, but I don't know how to begin to fix it. I have to throw this open to a benevolent Windows SSH expert.

MarkEdmondson1234 commented 4 years ago

Some troubleshooting links:

https://serverfault.com/questions/452268/hosts-file-ignored-how-to-troubleshoot
https://charlesr.co.uk/how-to-get-the-hosts-file-to-work-in-windows-10/

In all cases you should be able to copy the exact terminal call that googleComputeEngineR is doing by setting options(googleAuthR.verbose=2) and copy-pasting the shown system commands manually into your terminal. If someone can find an example that works manually, I can then move it across to the function.
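For anyone who wants to try a call by hand, an argument vector like the ones posted earlier in this thread can be collapsed into a single line for the terminal. This is a minimal sketch: as_command_line() is a hypothetical helper, and the arguments are copied from the earlier posts with the UserKnownHostsFile option left out.

```r
# Verbosity setting named in the comment above; it only takes effect once
# the package is loaded and used.
options(googleAuthR.verbose = 2)

# Hypothetical helper: collapse a command and its arguments into the single
# line that would be typed at a terminal prompt.
as_command_line <- function(cmd, args) {
  paste(c(cmd, trimws(args)), collapse = " ")
}

test_cmd <- "ssh"
test_sargs <- c(
  "-o BatchMode=yes -o StrictHostKeyChecking=no",
  " -i ",
  "C:/Users/crayn/.ssh/id_rsa",
  "crayn@35.203.124.94",
  "\"echo attempt25-1 ssh working\""
)

command_line <- as_command_line(test_cmd, test_sargs)
cat(command_line, "\n")
```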

camraynor commented 4 years ago

I was able to get around the issue in Windows by running RStudio in a local Docker container. It's a workaround, but it solves the problem for me.

adamribaudo commented 4 years ago

Looking into this further, I think we may be focused on the wrong issue. We've been focused on the "Failed to add the host to the list of known hosts" error but the "GetConsoleMode on STD_INPUT_HANDLE failed with 6" error seems more pressing as (I believe) it's describing an issue with accepting input from stdin and thus responsible for hanging the background process.

In other words, we can get by without a hosts file, but hanging the process presents more serious issues. As an example, I can't get through gce_vm_cluster() because it hangs when testing the SSH connection for the first VM.

Maybe this should split into 2 different issues?
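If the hang really is about stdin, one experiment would be to give the child process an empty stdin and a time limit. The sketch below is untested against the Windows hang itself. run_without_stdin() is a hypothetical wrapper, it relies on the stdin and timeout arguments of system2() (timeout needs R 3.5.0 or later), and the demonstration runs Rscript rather than ssh so that it works without a VM.

```r
# Hypothetical wrapper around system2(): feed the child process an empty
# stdin and stop waiting after `timeout` seconds instead of hanging.
run_without_stdin <- function(cmd, args, timeout = 30) {
  empty_input <- tempfile("empty-stdin-")
  file.create(empty_input)
  on.exit(unlink(empty_input), add = TRUE)
  system2(cmd, args = args, stdin = empty_input,
          stdout = TRUE, stderr = TRUE, timeout = timeout)
}

# Demonstration with Rscript, which ships with R, instead of ssh.
rscript <- file.path(R.home("bin"), "Rscript")
out <- run_without_stdin(
  rscript,
  c("-e", shQuote("cat('stdin is empty')")),
  timeout = 60
)
print(out)
```

Swapping in "ssh" and the arguments from the earlier posts would show whether the session still hangs once the time limit is in place.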

grantmcdermott commented 4 years ago

I've been silently following this thread since a bunch of my students on Windows had trouble setting up a GCE cluster during class. I don't have a solution, but saw someone mention the new (built-in) OpenSSH client that's available on Windows 10. Maybe a way forward?

https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse