googledatalab / datalab

Interactive tools and developer experiences for Big Data on Google Cloud Platform.
Apache License 2.0

How to copy data from datalab ssh instance to gs:// #2160

Open OrielResearchCure opened 4 years ago

OrielResearchCure commented 4 years ago

Hello all,

I have a connection issue that I was hoping to avoid by creating a new machine with the old disk, without much success. Please let me know if you can spot the issue or have any other advice. My steps were (run from Cloud Shell):

  1. Delete the instance, keeping the disk.
  2. Create a new instance with the kept disk.
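
For reference, the two steps above might look roughly like the following with the datalab CLI. This is only a sketch, not the exact commands that were run: the instance and disk names are hypothetical placeholders, and it assumes that `datalab delete` leaves the persistent disk in place unless `--delete-disk` is passed and that `datalab create` accepts `--disk-name` to reuse an existing disk. The `run` helper is also hypothetical; it only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical placeholder names: substitute your own.
INSTANCE="my-datalab"        # old instance to delete
NEW_INSTANCE="my-datalab-2"  # new instance to create
DISK="my-datalab-pd"         # persistent disk to keep and reuse

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Step 1: delete the instance. Assumption: the persistent disk is kept
# because --delete-disk is not passed.
run datalab delete "$INSTANCE"

# Step 2: create a new instance that reuses the kept disk.
# Assumption: --disk-name points the new instance at the existing disk.
run datalab create --disk-name "$DISK" "$NEW_INSTANCE"
```

Printing by default is deliberate, since the first command is destructive; review the output before rerunning with `DRY_RUN=0`.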

For the new instance, datalab connect didn't work.

docker ps:

    CONTAINER ID   IMAGE                                         COMMAND                  CREATED          STATUS          PORTS                      NAMES
    f5544a9595c3   gcr.io/cloud-datalab/datalab:latest           "/datalab/run.sh"        57 seconds ago   Up 56 seconds   127.0.0.1:8080->8080/tcp   datalab
    b5e0d07633b9   gcr.io/google-containers/fluentd-gcp:2.0.17   "/bin/sh -c '/run.sh…"   45 minutes ago   Up 45 minutes   80/tcp                     logger

cat /var/log/startupscript.log:

    Feb 23 01:43:40 dl-20200222 startup-script[539]: INFO startup-script: useradd: warning: the home directory already exists.
    Feb 23 01:43:40 dl-20200222 startup-script[539]: INFO startup-script: Not copying any file from skel directory into it.
    Feb 23 01:43:40 dl-20200222 useradd[577]: new group: name=logger, GID=2001
    Feb 23 01:43:40 dl-20200222 useradd[577]: new user: name=logger, UID=2001, GID=2001, home=/home/logger, shell=/bin/bash
    Feb 23 01:43:40 dl-20200222 startup-script[539]: INFO startup-script: useradd: warning: the home directory already exists.
    Feb 23 01:43:40 dl-20200222 startup-script[539]: INFO startup-script: Not copying any file from skel directory into it.
    Feb 23 01:43:40 dl-20200222 startup-script[539]: INFO startup-script: Getting Docker credentials
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: /home/datalab/.docker/config.json configured to use this credential helper for GCR registries
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: Pulling latest image: gcr.io/cloud-datalab/datalab:latest
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: latest: Pulling from cloud-datalab/datalab
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: Digest: sha256:d8826f5df792ddde9e152da08309edefc79b5ef30bf9232ef98c6772dffe6f7b
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: Status: Image is up to date for gcr.io/cloud-datalab/datalab:latest
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: gcr.io/cloud-datalab/datalab:latest
    Feb 23 01:43:43 dl-20200222 startup-script[539]: INFO startup-script: Trying to mount the persistent disk
    Feb 23 01:43:44 dl-20200222 startup-script[539]: INFO startup-script: Creating /mnt/disks/datalab-pd/content/datalab
    Feb 23 01:43:44 dl-20200222 startup-script[539]: INFO startup-script: Journal file /var/log/journal/3a1f7baf06dec3ffcbf19b2ae770da14/system@00059f344967b7bd-90e2c2550e1dba56.journal~ is truncated, ignoring file.
    Feb 23 01:43:44 dl-20200222 startup-script[539]: INFO startup-script: Return code 0.
    Feb 23 01:43:44 dl-20200222 startup-script[539]: INFO Finished running startup scripts.
    Feb 23 01:43:44 dl-20200222 systemd[1]: Started Google Compute Engine Startup Scripts.
    Feb 23 01:43:44 dl-20200222 systemd[1]: google-startup-scripts.service: Consumed 428ms CPU time
    -- Reboot --
    Feb 23 02:07:28 dl-20200222 systemd[1]: Starting Google Compute Engine Startup Scripts...
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO Starting startup scripts.
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO Found startup-script in metadata.
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO startup-script: useradd: user 'datalab' already exists
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO startup-script: useradd: user 'datalab' already exists
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO startup-script: useradd: user 'logger' already exists
    Feb 23 02:07:28 dl-20200222 startup-script[540]: INFO startup-script: useradd: user 'logger' already exists
    Feb 23 02:07:29 dl-20200222 startup-script[540]: INFO startup-script: Getting Docker credentials
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: /home/datalab/.docker/config.json configured to use this credential helper for GCR registries
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: Pulling latest image: gcr.io/cloud-datalab/datalab:latest
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: latest: Pulling from cloud-datalab/datalab
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: Digest: sha256:d8826f5df792ddde9e152da08309edefc79b5ef30bf9232ef98c6772dffe6f7b
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: Status: Image is up to date for gcr.io/cloud-datalab/datalab:latest
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: gcr.io/cloud-datalab/datalab:latest
    Feb 23 02:07:32 dl-20200222 startup-script[540]: INFO startup-script: Trying to mount the persistent disk
    Feb 23 02:07:33 dl-20200222 startup-script[540]: INFO startup-script: Creating /mnt/disks/datalab-pd/content/datalab
    Feb 23 02:07:33 dl-20200222 startup-script[540]: INFO startup-script: Journal file /var/log/journal/3a1f7baf06dec3ffcbf19b2ae770da14/system@00059f344967b7bd-90e2c2550e1dba56.journal~ is truncated, ignoring file.

I am not sure what the issue is. Is it the journal file? I am not familiar with this file at all. Can I copy /mnt/disks/datalab-pd/content/datalab to gs://? I looked for gsutil on the PATH and couldn't find it. Can I install it without creating any conflicts?
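
One untested possibility for the copy, instead of installing gsutil on the host, would be to run gsutil inside the running datalab container. The sketch below rests on several assumptions: that gsutil is available inside the gcr.io/cloud-datalab/datalab image, that the persistent disk's content directory is mounted at /content inside the container, and that the VM's credentials allow writing to the bucket. The bucket path is a hypothetical placeholder, and the `run` helper is hypothetical as well; it only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical destination: replace with your own bucket and prefix.
DEST="gs://my-backup-bucket/datalab-backup"
# Container name as listed by docker ps.
CONTAINER="datalab"
# Assumption: /mnt/disks/datalab-pd/content on the host appears as
# /content inside the container.
SRC="/content/datalab"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Recursively copy the notebooks directory to Cloud Storage using the
# gsutil that is assumed to ship inside the container.
run docker exec "$CONTAINER" gsutil -m cp -r "$SRC" "$DEST"
```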

Please let me know what you would advise to resolve this issue.

Many thanks, Eila