cloudyr / googleComputeEngineR

An R interface to the Google Cloud Compute API, for launching virtual machines
https://cloudyr.github.io/googleComputeEngineR/

port 22 is not open when running `gce_vm_cluster` #164

Open randy3k opened 4 years ago

randy3k commented 4 years ago

Describe the bug

gce_vm_cluster() fails with "port 22 is not open". This is with the latest release version of googleComputeEngineR, 0.3.0.

To Reproduce

r$> library(googleComputeEngineR)
Setting scopes to https://www.googleapis.com/auth/cloud-platform
Successfully auto-authenticated via service_account.json
Set default project ID to '<censored>'
Set default zone to 'us-central1-a'

r$> vms <- gce_vm_cluster()
2020-06-01 01:56:44> # Creating cluster with settings: template = r-base, dynamic_image = rocker/r-parallel, wait = FALSE, predefined_type = n1-standard-1
2020-06-01 01:56:51> Operation running...
2020-06-01 01:56:57> Operation complete in 7 secs
2020-06-01 01:57:00> Operation complete in 7 secs
2020-06-01 01:57:04> Operation complete in 6 secs
2020-06-01 01:57:05> r-cluster-1 VM running
2020-06-01 01:57:06> r-cluster-2 VM running
2020-06-01 01:57:08> r-cluster-3 VM running
2020-06-01 01:57:16> Public SSH key uploaded to instance
2020-06-01 01:57:24> Public SSH key uploaded to instance
2020-06-01 01:57:32> Public SSH key uploaded to instance
2020-06-01 01:57:32> # Testing cluster:
Error: port 22 is not open for 34.69.5.250

I am pretty sure the port is open, because I can ssh to the host directly:

(randyimac)-gce$ ssh 34.69.5.250
The authenticity of host '34.69.5.250 (34.69.5.250)' can't be established.
ED25519 key fingerprint is SHA256:i6cPMUTAmaKg0Jy2lS/m0JwKggJN3RnSSSNF/d5bd7g.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '34.69.5.250' (ED25519) to the list of known hosts.
Randy@r-cluster-1 ~ $

Expected behavior

The command should run without error.

Session Info

r$> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] googleComputeEngineR_0.3.0

loaded via a namespace (and not attached):
 [1] codetools_0.2-16  listenv_0.8.0     future_1.16.0     digest_0.6.25
 [5] assertthat_0.2.1  R6_2.4.1          jsonlite_1.6.1    httr_1.4.1
 [9] rlang_0.4.6       curl_4.3          fs_1.4.1          googleAuthR_1.2.1
[13] tools_3.6.3       glue_1.4.1        parallel_3.6.3    compiler_3.6.3
[17] askpass_1.1       gargle_0.4.0.9004 globals_0.12.5    memoise_1.1.0
[21] openssl_1.4.1

MarkEdmondson1234 commented 4 years ago

It can sometimes take a moment for the SSH port to become reachable. Does it work when connecting to one of the VMs via gce_ssh()? The logs indicate that the SSH key upload was at least successful.

randy3k commented 4 years ago

Yes. gce_ssh works.

MarkEdmondson1234 commented 4 years ago

OK, then I think it's the check that failed, while everything is actually working. I may need to put a longer pause in before the check. You should be able to send up parallel jobs etc. using library(future).
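
A minimal sketch of the "longer pause" idea, written in base R only. It is not the check that googleComputeEngineR runs internally: `wait_for_port()` and all of its arguments are hypothetical names made up for illustration. It retries a TCP connection to the port a few times and sleeps between attempts, instead of failing on the first try.

```r
# Hypothetical helper (not part of googleComputeEngineR): poll a TCP port
# until it accepts a connection or the attempts run out.
wait_for_port <- function(host, port = 22L, attempts = 10L,
                          pause = 5, timeout = 3) {
  for (i in seq_len(attempts)) {
    is_open <- tryCatch({
      # socketConnection() raises an error if the connection cannot be opened
      con <- suppressWarnings(
        socketConnection(host, port = port, open = "r+",
                         blocking = TRUE, timeout = timeout)
      )
      close(con)
      TRUE
    }, error = function(e) FALSE)
    if (is_open) return(TRUE)
    # Sleep before the next attempt, but not after the last one
    if (i < attempts) Sys.sleep(pause)
  }
  FALSE
}
```

For example, calling `wait_for_port("34.69.5.250")` before `plan()` would wait for up to ten attempts for SSH on that VM to become reachable, and return FALSE if it never does.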

randy3k commented 4 years ago

For some reason, I got the following error:

r$> plan(cluster, workers = vms)
bash: /usr/local/bin/docker: No such file or directory

This is what I am trying to do:

vm1 <- gce_vm("r-cluster-1")
vm2 <- gce_vm("r-cluster-2")
vm3 <- gce_vm("r-cluster-3")

vms <- list(vm1, vm2, vm3)

# need this otherwise "check_ssh_set(x) is not TRUE"
vms <- lapply(vms, function(v) gce_ssh_setup(v, key.pub = "~/.ssh/id_rsa.pub"))

plan(cluster, workers = vms)

MarkEdmondson1234 commented 4 years ago

You shouldn't need gce_ssh_setup() anymore; that should be handled by gce_vm_cluster().

If you run gce_vm_cluster() again with the same names, does it return? If the cluster is already up, it should just return the existing VMs. Then you can use the VMs it returns:

vms <- gce_vm_cluster()
plan(cluster, workers = as.cluster(vms))

Otherwise, I think you can build the list the way you are doing it, but you need to wrap it in as.cluster():

plan(cluster, workers = as.cluster(vms))

I am referring to this documentation: https://cloudyr.github.io/googleComputeEngineR/articles/massive-parallel.html (the website is the most up to date).

randy3k commented 4 years ago

Thanks. But I got this if I do not run gce_ssh_setup first.

> vms <- list(vm1, vm2, vm3)
> plan(cluster, workers = as.cluster(vms))
Error in as.cluster.gce_instance(X[[i]], ...) :
  check_ssh_set(x) is not TRUE

randy3k commented 4 years ago

Back to the "bash: /usr/local/bin/docker: No such file or directory" error: it seems to come from the following line.

> makeClusterPSOCK("34.71.11.230", rscript = c("docker", "run", "--net=host", "rocker/r-parallel", "Rscript"))
bash: /usr/local/bin/docker: No such file or directory

I have a docker installed on my system, and its path is /usr/local/bin/docker. It seems that makeClusterPSOCK was trying to resolve the path of docker locally: https://github.com/HenrikBengtsson/future/blob/30a01ea4b3a922376549f054059325593163f917/R/makeClusterPSOCK.R#L505

randy3k commented 4 years ago

Filed a bug with future: https://github.com/HenrikBengtsson/future/issues/386

MarkEdmondson1234 commented 4 years ago

> Thanks. But I got this if I do not run gce_ssh_setup first.

If you set up the VMs without gce_vm_cluster(), then this step is necessary, but when gce_vm_cluster() completes it does this step for you. I think it should complete the second time if you use the existing VM names, since SSH is working when you connect manually.

> I have a docker installed on my system, and its path is /usr/local/bin/docker.

The call should be calling docker on the VM which should have docker installed

randy3k commented 4 years ago

> The call should be calling docker on the VM which should have docker installed

The issue has been fixed upstream.