Closed rak5216 closed 3 years ago
The workflow runs fine with TensorFlow 2.3. However, there are some issues with the metrics: they seem to be in a different order from one version to the other.
I am trying to compare the results from TensorFlow 2.1 and 2.3, and I created tf-debug config files for that purpose. However, we need to make sure the training set and the starting points for training are the same.
Using encoder_weights: 'imagenet' raises an error regarding the input_shape:
Traceback (most recent call last):
  File "train_segmentation_model.py", line 212, in <module>
    train(**argparser.parse_args().__dict__)
  File "train_segmentation_model.py", line 94, in train
    train_config['optimizer'])
  File "/home/cfurtado/necstlab-damage-segmentation/models.py", line 113, in generate_compiled_segmentation_model
    model = Unet(input_shape=(None, None, 1), classes=num_classes, **model_parameters)
  File "/home/cfurtado/.local/lib/python3.5/site-packages/segmentation_models/__init__.py", line 34, in wrapper
    return func(*args, **kwargs)
  File "/home/cfurtado/.local/lib/python3.5/site-packages/segmentation_models/models/unet.py", line 226, in Unet
    **kwargs,
  File "/home/cfurtado/.local/lib/python3.5/site-packages/segmentation_models/backbones/backbones_factory.py", line 103, in get_backbone
    model = model_fn(*args, **kwargs)
  File "/home/cfurtado/.local/lib/python3.5/site-packages/classification_models/models_factory.py", line 78, in wrapper
    return func(*args, **new_kwargs)
  File "/home/cfurtado/.local/lib/python3.5/site-packages/keras_applications/vgg16.py", line 99, in VGG16
    weights=weights)
  File "/home/cfurtado/.local/lib/python3.5/site-packages/keras_applications/imagenet_utils.py", line 316, in _obtain_input_shape
    '`input_shape=' + str(input_shape) + '`')
ValueError: The input must have 3 channels; got `input_shape=(None, None, 1)`
Josh's thoughts on that: "I believe channels here corresponds to image colors. it looks like the imagenet pretrained backbone is expecting 3 channels (colors) and we’re just using 1 (just gray rather than red/green/blue). So I’d start by looking into if that pretrained backbone actually can support 1 channel or not. And if not does that mean in it working in the past required us to use 3 channels?"
Reed - the default instantiation is input_shape=(None, None, 3), so we immediately violate that by setting (None, None, 1). And imagenet is indeed full color, so I think we will just ignore it, since our input makes this an apples-to-oranges comparison.
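The constraint behind this error can be checked before any model is built. The sketch below is a simplified stand-in for that check, not the keras_applications code. The helper name `validate_encoder_input` is hypothetical, and it assumes channels-last shapes like the ones in the traceback.

```python
def validate_encoder_input(input_shape, encoder_weights):
    """Fail early if ImageNet weights are requested for a non-RGB input.

    Hypothetical helper: assumes channels-last shapes such as (None, None, 1).
    """
    channels = input_shape[-1]
    if encoder_weights == "imagenet" and channels != 3:
        raise ValueError(
            "encoder_weights='imagenet' needs 3 input channels, "
            "got input_shape={}".format(input_shape)
        )
    return input_shape


# Grayscale input without pretrained encoder weights is accepted.
validate_encoder_input((None, None, 1), None)

# Grayscale input with ImageNet weights is rejected before any model is built.
try:
    validate_encoder_input((None, None, 1), "imagenet")
except ValueError as err:
    print(err)
```

Calling something like this at the top of the model-building function would surface the mismatch with a message that names both settings, instead of a deep traceback from the backbone factory.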
So we will try a different approach to get models to start with the same initial weights.
We are now working around it by allowing training to start from the weights of a previously trained model (so, with fixed weights that are the outcome of training a model): Issue: Enable pre-training by initializing new model with previously trained weights #32
Running on cpu (to get repeatable results):
tf 2.1 loss:     0.002412608 0.001155156 0.000974846 0.000862088
tf 2.1 val_loss: 0.02849 0.002013 0.106169 0.002202
tf 2.3 loss:     0.002411151 0.001134973 0.000977466 0.000861835
tf 2.3 val_loss: 0.022246 0.002309 0.005034 0.002559
Loss seems almost equal, not the val_loss though. Any thoughts @Josh-Joseph @rak5216 ?
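To put numbers on "almost equal", here is a small sketch that computes the per-epoch relative difference between the two runs. It assumes the second row of each pair above is val_loss; the function name `relative_differences` is only illustrative.

```python
# Per-epoch values copied from the CPU runs above.
tf21 = {
    "loss": [0.002412608, 0.001155156, 0.000974846, 0.000862088],
    "val_loss": [0.02849, 0.002013, 0.106169, 0.002202],
}
tf23 = {
    "loss": [0.002411151, 0.001134973, 0.000977466, 0.000861835],
    "val_loss": [0.022246, 0.002309, 0.005034, 0.002559],
}


def relative_differences(reference, other):
    """Return |other - reference| / |reference| for each pair of values."""
    return [abs(b - a) / abs(a) for a, b in zip(reference, other)]


for name in ("loss", "val_loss"):
    diffs = relative_differences(tf21[name], tf23[name])
    print(name, ["{:.2%}".format(d) for d in diffs])
```

With these values the training loss stays within 2% at every epoch, while val_loss differs by more than 10% at every epoch and by over 90% at the third one.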
@CarolinaFurtado can you check binary_ce_metric too? It should match loss in the train and val sets. We also need to verify that tf 2.3 is repeatable on cpu. There's a chance that val loss is just more volatile than train loss, but ultimately, the bottom line is that tf 2.1 is not reproduced by tf 2.3.
@rak5216, for 2.3:
for 2.1
tf2.1 and 2.3 should not match each other here: different pretrained models
confirmed that 2.1 and 2.3 give slightly different results, even when run on cpu
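For the within-version repeatability check, a minimal seeding sketch could look like the following. This is an assumption about how one might pin the random state, not what the training script does; `set_global_seeds` is a hypothetical name, and NumPy and TensorFlow are seeded only if they can be imported.

```python
import os
import random


def set_global_seeds(seed):
    """Seed the random number generators that are importable here."""
    # Only affects Python subprocesses started after this point; the hash
    # seed of the current interpreter is fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TF 2.x API
    except ImportError:
        pass


set_global_seeds(0)
first = random.random()
set_global_seeds(0)
second = random.random()
print(first == second)
```

Seeding like this can only help a single TensorFlow version repeat itself on cpu. It is not expected to make 2.1 and 2.3 agree with each other, which matches what was observed above.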
Fixed the metrics mosaic by removing loss from metric_names. Results don't match exactly because we don't have repeatability between 2.1 and 2.3.
2.3
2.1
dict_results = dict(zip(metric_names, all_results)) in models.py was zipping two lists of different lengths, meaning the selected optimizing_result was wrong in the end. Removed loss from metric_names.
wrong! loss + binary_crossentropy + binary_ce_metric (off by one)
{'loss': 0.0010814343113452196,
 'binary_ce_metric': 3.68487532154127e-11, - WRONG!!!! should all be the same, but since we are dict(zip(10 names, 9 metrics)), it ignores the last one, and the 3 first are binary cross entropy
 'class0_f1_score': 0.29115191102027893,
 'class0_binary_accuracy_sm': 0.9995417594909668,
 'binary_crossentropy': 0.0010814343113452196,
 'class0_precision': 0.33629000186920166,
 'class0_binary_cross_entropy': 0.9995417594909668,
 'class0_iou_score': 0.31209734082221985, - this is the optimal value. wrong here
 'class0_binary_accuracy_tfkeras': 0.18490245938301086}
right! loss + binary_ce_metric
{'class0_f1_score': 0.31209734082221985,
 'class0_precision': 0.29115191102027893,
 'class0_recall': 0.33629000186920166,
 'loss': 0.0010814343113452196,
 'class0_binary_cross_entropy': 3.68487532154127e-11,
 'class0_binary_accuracy_tfkeras': 0.9995417594909668,
 'binary_ce_metric': 0.0010814343113452196,
 'class0_binary_accuracy_sm': 0.9995417594909668,
 'class0_iou_score': 0.18490245938301086} - this is the optimal value. ok here
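The off-by-one comes from how zip behaves: it stops at the shorter input without complaining. The sketch below reproduces the effect with a shortened, made-up list of names and values, and shows one possible guard. `zip_metrics` is a hypothetical helper, not the code in models.py.

```python
def zip_metrics(metric_names, all_results):
    """Pair names with values, refusing to silently drop an unmatched entry."""
    if len(metric_names) != len(all_results):
        raise ValueError(
            "got {} metric names but {} results".format(
                len(metric_names), len(all_results)
            )
        )
    return dict(zip(metric_names, all_results))


# Hypothetical, shortened example: one name too many at the front.
names = ["loss", "binary_ce_metric", "class0_iou_score"]
results = [0.0011, 0.1849]  # meant for the last two names only

# Plain zip stops at the shorter list, so every value is attached to the
# name one position too early and the last name is dropped.
shifted = dict(zip(names, results))
print(shifted)

# The guarded version raises instead of returning a shifted dict.
try:
    zip_metrics(names, results)
except ValueError as err:
    print(err)
```

A length check like this turns a silently wrong optimizing_result into an immediate error. On Python 3.10 and later, `zip(..., strict=True)` gives the same protection, but that is not available on the Python 3.5 and 3.6 environments used here.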
Same issue: removed loss from metric_names. Results match.
warning when creating the VM:
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
current version: ubuntu 16.04 + cuda 10.1 + cudnn 7.6.5 ---- python 3.5, which is discontinued.
version trial 1: ubuntu 18.04 + cuda 10.1 + cudnn 7.6.5 ---- python 3.6. It works, but tf is not connecting to gpu
version trial 2: ubuntu 18.04 + cuda 11.0 + cudnn 8.0 ---- python 3.6. It works, but tf is not connecting to gpu
Other people have had similar problems when updating to ubuntu 18.04: https://github.com/tensorflow/tensorflow/issues/43236
boot_disk {
  initialize_params {
    image = "projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201014"
    size = "${var.hard_drive_size_gp}"
    type = "pd-ssd"
  }
}
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install nvidia-driver-418
sudo apt-get -y install cuda-10.1
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
# install cudnn
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[0]: Creating...
google_compute_instance.vm[0]: Still creating... [10s elapsed]
google_compute_instance.vm[0]: Still creating... [20s elapsed]
google_compute_instance.vm[0]: Still creating... [30s elapsed]
google_compute_instance.vm[0]: Provisioning with 'file'...
google_compute_instance.vm[0]: Still creating... [40s elapsed]
google_compute_instance.vm[0]: Still creating... [50s elapsed]
google_compute_instance.vm[0]: Still creating... [1m0s elapsed]
google_compute_instance.vm[0]: Still creating... [1m10s elapsed]
google_compute_instance.vm[0]: Provisioning with 'remote-exec'...
google_compute_instance.vm[0] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[0] (remote-exec): Host: 34.74.190.126
google_compute_instance.vm[0] (remote-exec): User: cfurtado
google_compute_instance.vm[0] (remote-exec): Password: false
google_compute_instance.vm[0] (remote-exec): Private key: true
google_compute_instance.vm[0] (remote-exec): Certificate: false
google_compute_instance.vm[0] (remote-exec): SSH Agent: false
google_compute_instance.vm[0] (remote-exec): Checking Host Key: false
google_compute_instance.vm[0] (remote-exec): Connected!
google_compute_instance.vm[0] (remote-exec): Running resource creation script... (this may take 10+ minutes)
google_compute_instance.vm[0] (remote-exec): W: GPG error: http://archive.ubuntu.com/ubuntu bionic InRelease: Splitting up /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic_InRelease into data and signature failed
google_compute_instance.vm[0] (remote-exec): E: The repository 'http://archive.ubuntu.com/ubuntu bionic InRelease' is not signed.
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 75%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [1m20s elapsed]
google_compute_instance.vm[0]: Still creating... [1m30s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-02 15:04:59-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 190 [application/octet-stream]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘cuda-ubuntu1804.pin’
google_compute_instance.vm[0] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): cuda-ubuntu 100% 190 --.-KB/s in 0s
google_compute_instance.vm[0] (remote-exec): 2020-11-02 15:04:59 (6.74 MB/s) - ‘cuda-ubuntu1804.pin’ saved [190/190]
google_compute_instance.vm[0] (remote-exec): --2020-11-02 15:04:59-- http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 1859785444 (1.7G) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): cuda-repo-u 100% 1.73G 282MB/s in 6.4s
google_compute_instance.vm[0] (remote-exec): 2020-11-02 15:05:06 (277 MB/s) - ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’ saved [1859785444/1859785444]
google_compute_instance.vm[0] (remote-exec): Warning: apt-key output should not be parsed (stdout is not a terminal)
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 20%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 41%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 62%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 83%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [3m6s elapsed]
google_compute_instance.vm[0]: Still creating... [3m16s elapsed]
google_compute_instance.vm[0]: Still creating... [3m26s elapsed]
google_compute_instance.vm[0]: Still creating... [3m36s elapsed]
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 25%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 50%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 75%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [3m46s elapsed]
google_compute_instance.vm[0]: Still creating... [3m56s elapsed]
google_compute_instance.vm[0]: Still creating... [4m6s elapsed]
google_compute_instance.vm[0]: Still creating... [4m16s elapsed]
google_compute_instance.vm[0]: Still creating... [4m26s elapsed]
google_compute_instance.vm[0]: Still creating... [4m36s elapsed]
google_compute_instance.vm[0]: Still creating... [4m46s elapsed]
google_compute_instance.vm[0]: Still creating... [4m56s elapsed]
google_compute_instance.vm[0]: Still creating... [5m6s elapsed]
google_compute_instance.vm[0]: Still creating... [5m16s elapsed]
google_compute_instance.vm[0]: Still creating... [5m26s elapsed]
google_compute_instance.vm[0]: Still creating... [5m36s elapsed]
google_compute_instance.vm[0]: Still creating... [5m46s elapsed]
google_compute_instance.vm[0]: Still creating... [5m56s elapsed]
google_compute_instance.vm[0]: Still creating... [6m6s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-02 15:09:32-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 182313188 (174M) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): libcud 21% 37.99M 190MB/s
google_compute_instance.vm[0] (remote-exec): libcudn 54% 94.82M 237MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn 87% 151.90M 253MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn7_7 100% 173.87M 257MB/s in 0.7s
google_compute_instance.vm[0] (remote-exec): 2020-11-02 15:09:33 (257 MB/s) - ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’ saved [182313188/182313188]
google_compute_instance.vm[0]: Still creating... [6m16s elapsed]
google_compute_instance.vm[0]: Still creating... [6m26s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-02 15:09:51-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 160506208 (153M) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): libcud 23% 35.31M 177MB/s
google_compute_instance.vm[0] (remote-exec): libcudn 60% 92.68M 232MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn 97% 149.30M 249MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn7-d 100% 153.07M 250MB/s in 0.6s
google_compute_instance.vm[0] (remote-exec): 2020-11-02 15:09:52 (250 MB/s) - ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’ saved [160506208/160506208]
google_compute_instance.vm[0]: Still creating... [6m36s elapsed]
google_compute_instance.vm[0]: Still creating... [6m46s elapsed]
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 13%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 26%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 40%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 53%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 67%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 80%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 94%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [7m8s elapsed]
google_compute_instance.vm[0]: Still creating... [7m18s elapsed]
google_compute_instance.vm[0]: Still creating... [7m28s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0]: Still creating... [7m38s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts easy_install and easy_install-3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: Skipping crcmod as it is not installed.
google_compute_instance.vm[0]: Still creating... [7m48s elapsed]
google_compute_instance.vm[0]: Still creating... [7m58s elapsed]
google_compute_instance.vm[0]: Still creating... [8m8s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts f2py, f2py3 and f2py3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script markdown_py is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts pyrsa-decrypt, pyrsa-encrypt, pyrsa-keygen, pyrsa-priv2pub, pyrsa-sign and pyrsa-verify are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script google-oauthlib-tool is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script tensorboard is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts estimator_ckpt_converter, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts imageio_download_bin and imageio_remove_bin are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts lsm2bin and tifffile are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script skivi is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script pygmentize is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts iptest, iptest3, ipython and ipython3 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0]: Still creating... [8m18s elapsed]
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter, jupyter-migrate and jupyter-troubleshoot are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter-kernel, jupyter-kernelspec and jupyter-run are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-console is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-trust is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-nbconvert is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter-bundlerextension, jupyter-nbextension, jupyter-notebook and jupyter-serverextension are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
google_compute_instance.vm[0] (remote-exec): We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
google_compute_instance.vm[0] (remote-exec): tensorflow-gpu 2.3.1 requires numpy<1.19.0,>=1.16.0, but you'll have numpy 1.19.3 which is incompatible.
google_compute_instance.vm[0]: Provisioning with 'remote-exec'...
google_compute_instance.vm[0] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[0] (remote-exec): Host: 34.74.190.126
google_compute_instance.vm[0] (remote-exec): User: cfurtado
google_compute_instance.vm[0] (remote-exec): Password: false
google_compute_instance.vm[0] (remote-exec): Private key: true
google_compute_instance.vm[0] (remote-exec): Certificate: false
google_compute_instance.vm[0] (remote-exec): SSH Agent: false
google_compute_instance.vm[0] (remote-exec): Checking Host Key: false
google_compute_instance.vm[0] (remote-exec): Connected!
Operation completed over 43 objects/273.8 MiB.
2020-11-02 15:34:55.974295: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-02 15:34:56.007081: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-11-02 15:34:56.007139: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (cfurtado-necstlab-0): /proc/driver/nvidia/version does not exist
2020-11-02 15:34:56.007862: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-02 15:34:56.037040: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2300000000 Hz
2020-11-02 15:34:56.037485: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4a4ab60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-02 15:34:56.037552: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "functional_1"
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
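Both messages point at the kernel driver rather than at CUDA or cuDNN. A quick way to check the two symptoms from Python is sketched below. The function name `nvidia_driver_status` is hypothetical, and the check only looks at the file TensorFlow mentions in its log and at whether nvidia-smi is on PATH.

```python
import os
import shutil


def nvidia_driver_status():
    """Report two basic signs that the NVIDIA kernel driver is usable."""
    return {
        # The TensorFlow log above complains that this file does not exist.
        "kernel_driver_loaded": os.path.exists("/proc/driver/nvidia/version"),
        # nvidia-smi is installed alongside the driver.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


print(nvidia_driver_status())
```

If kernel_driver_loaded is False, a correct CUDA and cuDNN install will not be enough for TensorFlow to see the GPU. In TensorFlow 2.1 and later, `tf.config.list_physical_devices("GPU")` would be expected to return an empty list in that state.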
cuda version - ok
$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
ubuntu version - ok
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
python version
$ python3 --version
Python 3.6.9
cudnn version - ok 7.6.5:
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
boot_disk {
  initialize_params {
    image = "projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20201028"
    size = "${var.hard_drive_size_gp}"
    type = "pd-ssd"
  }
}
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda-10.1
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
# install cudnn
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
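One thing that may be worth ruling out: the script above pulls every package from NVIDIA's `ubuntu1804` repositories, while the boot image is `ubuntu-2004-focal`. Whether that mismatch is what makes `dpkg` fail in the log below is not established here. The sketch only makes the comparison explicit, and the helper names (`repo_tags`, `image_tag`, `mismatched_tags`) are hypothetical.

```python
import re


def repo_tags(script_text):
    """Collect the distinct 'ubuntuNNNN' release tags that appear in package URLs."""
    return sorted(set(re.findall(r"ubuntu(\d{4})", script_text)))


def image_tag(image_name):
    """Extract the release digits from an image name like 'ubuntu-2004-focal-v20201028'."""
    match = re.search(r"ubuntu-(\d{4})-", image_name)
    return match.group(1) if match else None


def mismatched_tags(script_text, image_name):
    """Return the repo tags in the script that differ from the image's release."""
    release = image_tag(image_name)
    return [tag for tag in repo_tags(script_text) if tag != release]


if __name__ == "__main__":
    script = (
        "wget https://developer.download.nvidia.com/compute/cuda/repos/"
        "ubuntu1804/x86_64/cuda-ubuntu1804.pin"
    )
    image = "projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20201028"
    print(mismatched_tags(script, image))
```

An empty result means the repository tags and the image release agree. It says nothing about whether the packages themselves install cleanly.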
google_compute_instance.vm[1]: Creating...
google_compute_instance.vm[1]: Still creating... [10s elapsed]
google_compute_instance.vm[1]: Still creating... [20s elapsed]
google_compute_instance.vm[1]: Still creating... [30s elapsed]
google_compute_instance.vm[1]: Provisioning with 'file'...
google_compute_instance.vm[1]: Still creating... [40s elapsed]
google_compute_instance.vm[1]: Still creating... [50s elapsed]
google_compute_instance.vm[1]: Still creating... [1m0s elapsed]
google_compute_instance.vm[1]: Provisioning with 'remote-exec'...
google_compute_instance.vm[1] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[1] (remote-exec): Host: 34.74.110.244
google_compute_instance.vm[1] (remote-exec): User: cfurtado
google_compute_instance.vm[1] (remote-exec): Password: false
google_compute_instance.vm[1] (remote-exec): Private key: true
google_compute_instance.vm[1] (remote-exec): Certificate: false
google_compute_instance.vm[1] (remote-exec): SSH Agent: false
google_compute_instance.vm[1] (remote-exec): Checking Host Key: false
google_compute_instance.vm[1] (remote-exec): Connected!
google_compute_instance.vm[1] (remote-exec): Running resource creation script... (this may take 10+ minutes)
google_compute_instance.vm[1]: Still creating... [1m10s elapsed]
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 73%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[1]: Still creating... [1m20s elapsed]
google_compute_instance.vm[1]: Still creating... [1m30s elapsed]
google_compute_instance.vm[1] (remote-exec): --2020-11-02 17:26:59-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
google_compute_instance.vm[1] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[1] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
google_compute_instance.vm[1] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[1] (remote-exec): Length: 190 [application/octet-stream]
google_compute_instance.vm[1] (remote-exec): Saving to: ‘cuda-ubuntu1804.pin’
google_compute_instance.vm[1] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[1] (remote-exec): cuda-ubuntu 100% 190 --.-KB/s in 0s
google_compute_instance.vm[1] (remote-exec): 2020-11-02 17:26:59 (7.35 MB/s) - ‘cuda-ubuntu1804.pin’ saved [190/190]
google_compute_instance.vm[1] (remote-exec): --2020-11-02 17:27:00-- http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
google_compute_instance.vm[1] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[1] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[1] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[1] (remote-exec): Length: 1859785444 (1.7G) [application/x-deb]
google_compute_instance.vm[1] (remote-exec): Saving to: ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’
google_compute_instance.vm[1] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[1] (remote-exec): cuda-r 2% 38.98M 195MB/s
google_compute_instance.vm[1] (remote-exec): cuda-re 5% 96.70M 242MB/s
google_compute_instance.vm[1] (remote-exec): cuda-rep 8% 154.87M 258MB/s
google_compute_instance.vm[1] (remote-exec): cuda-repo 11% 212.24M 265MB/s
google_compute_instance.vm[1] (remote-exec): cuda-repo- 15% 268.35M 268MB/s
google_compute_instance.vm[1] (remote-exec): cuda-repo-u 18% 321.65M 268MB/s
google_compute_instance.vm[1] (remote-exec): uda-repo-ub 21% 379.05M 271MB/s
google_compute_instance.vm[1] (remote-exec): da-repo-ubu 24% 436.84M 273MB/s
google_compute_instance.vm[1] (remote-exec): a-repo-ubun 27% 494.44M 275MB/s
google_compute_instance.vm[1] (remote-exec): -repo-ubunt 31% 550.39M 275MB/s
google_compute_instance.vm[1] (remote-exec): repo-ubuntu 34% 608.26M 276MB/s
google_compute_instance.vm[1] (remote-exec): epo-ubuntu1 37% 665.78M 277MB/s
google_compute_instance.vm[1]: Still creating... [1m40s elapsed]
google_compute_instance.vm[1] (remote-exec): po-ubuntu18 40% 723.61M 278MB/s
google_compute_instance.vm[1] (remote-exec): o-ubuntu180 44% 780.44M 279MB/s
google_compute_instance.vm[1] (remote-exec): -ubuntu1804 47% 837.91M 279MB/s eta 3s
google_compute_instance.vm[1] (remote-exec): ubuntu1804- 50% 895.65M 286MB/s eta 3s
google_compute_instance.vm[1] (remote-exec): buntu1804-1 53% 953.15M 285MB/s eta 3s
google_compute_instance.vm[1] (remote-exec): untu1804-10 57% 1011M 285MB/s eta 3s
google_compute_instance.vm[1] (remote-exec): ntu1804-10- 60% 1.04G 286MB/s eta 3s
google_compute_instance.vm[1] (remote-exec): tu1804-10-1 63% 1.10G 286MB/s eta 2s
google_compute_instance.vm[1] (remote-exec): u1804-10-1- 66% 1.16G 287MB/s eta 2s
google_compute_instance.vm[1] (remote-exec): 1804-10-1-l 69% 1.21G 287MB/s eta 2s
google_compute_instance.vm[1] (remote-exec): 804-10-1-lo 73% 1.27G 287MB/s eta 2s
google_compute_instance.vm[1] (remote-exec): 04-10-1-loc 76% 1.32G 287MB/s eta 2s
google_compute_instance.vm[1] (remote-exec): 4-10-1-loca 79% 1.38G 287MB/s eta 1s
google_compute_instance.vm[1] (remote-exec): -10-1-local 82% 1.43G 287MB/s eta 1s
google_compute_instance.vm[1] (remote-exec): 10-1-local- 85% 1.49G 285MB/s eta 1s
google_compute_instance.vm[1] (remote-exec): 0-1-local-1 88% 1.54G 284MB/s eta 1s
google_compute_instance.vm[1] (remote-exec): -1-local-10 91% 1.59G 284MB/s eta 1s
google_compute_instance.vm[1] (remote-exec): 1-local-10. 95% 1.65G 283MB/s eta 0s
google_compute_instance.vm[1] (remote-exec): -local-10.1 98% 1.70G 283MB/s eta 0s
google_compute_instance.vm[1] (remote-exec): cuda-repo-u 100% 1.73G 283MB/s in 6.3s
google_compute_instance.vm[1] (remote-exec): 2020-11-02 17:27:06 (281 MB/s) - ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’ saved [1859785444/1859785444]
google_compute_instance.vm[1]: Still creating... [1m50s elapsed]
google_compute_instance.vm[1] (remote-exec): Warning: apt-key output should not be parsed (stdout is not a terminal)
google_compute_instance.vm[1]: Still creating... [2m0s elapsed]
google_compute_instance.vm[1]: Still creating... [2m10s elapsed]
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 4%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 9%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 14%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 19%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 24%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 29%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 34%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 39%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 44%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 49%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 54%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 59%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 63%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 68%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 73%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 78%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 83%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 88%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 93%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 98%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[1]: Still creating... [2m20s elapsed]
google_compute_instance.vm[1]: Still creating... [2m30s elapsed]
google_compute_instance.vm[1]: Still creating... [2m40s elapsed]
google_compute_instance.vm[1]: Still creating... [2m50s elapsed]
google_compute_instance.vm[1]: Still creating... [3m0s elapsed]
google_compute_instance.vm[1]: Still creating... [3m10s elapsed]
google_compute_instance.vm[1]: Still creating... [3m20s elapsed]
google_compute_instance.vm[1]: Still creating... [3m30s elapsed]
google_compute_instance.vm[1]: Still creating... [3m40s elapsed]
google_compute_instance.vm[1]: Still creating... [3m50s elapsed]
google_compute_instance.vm[1]: Still creating... [4m0s elapsed]
google_compute_instance.vm[1]: Still creating... [4m10s elapsed]
google_compute_instance.vm[1]: Still creating... [4m20s elapsed]
google_compute_instance.vm[1]: Still creating... [4m30s elapsed]
google_compute_instance.vm[1]: Still creating... [4m40s elapsed]
google_compute_instance.vm[1]: Still creating... [4m50s elapsed]
google_compute_instance.vm[1]: Still creating... [5m0s elapsed]
google_compute_instance.vm[1]: Still creating... [5m10s elapsed]
google_compute_instance.vm[1]: Still creating... [5m20s elapsed]
google_compute_instance.vm[1]: Still creating... [5m30s elapsed]
google_compute_instance.vm[1]: Still creating... [5m40s elapsed]
google_compute_instance.vm[1]: Still creating... [5m50s elapsed]
google_compute_instance.vm[1] (remote-exec): No apport report written because the error message indicates its a followup error from a previous failure.
google_compute_instance.vm[1] (remote-exec): No apport report written because the error message indicates its a followup error from a previous failure.
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1]: Still creating... [6m0s elapsed]
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1]: Still creating... [6m10s elapsed]
google_compute_instance.vm[1]: Still creating... [6m20s elapsed]
google_compute_instance.vm[1]: Still creating... [6m30s elapsed]
google_compute_instance.vm[1] (remote-exec): E: Sub-process /usr/bin/dpkg returned an error code (1)
google_compute_instance.vm[1] (remote-exec): --2020-11-02 17:31:55-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[1] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[1] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[1] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[1] (remote-exec): Length: 182313188 (174M) [application/x-deb]
google_compute_instance.vm[1] (remote-exec): Saving to: ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[1] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[1] (remote-exec): libcud 19% 34.65M 173MB/s
google_compute_instance.vm[1] (remote-exec): libcudn 53% 92.31M 231MB/s
google_compute_instance.vm[1] (remote-exec): libcudnn 86% 150.05M 250MB/s
google_compute_instance.vm[1] (remote-exec): libcudnn7_7 100% 173.87M 255MB/s in 0.7s
google_compute_instance.vm[1] (remote-exec): 2020-11-02 17:31:56 (255 MB/s) - ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’ saved [182313188/182313188]
google_compute_instance.vm[1]: Still creating... [6m40s elapsed]
google_compute_instance.vm[1]: Still creating... [6m50s elapsed]
google_compute_instance.vm[1] (remote-exec): --2020-11-02 17:32:14-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[1] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[1] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[1] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[1] (remote-exec): Length: 160506208 (153M) [application/x-deb]
google_compute_instance.vm[1] (remote-exec): Saving to: ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[1] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[1] (remote-exec): libcud 24% 37.56M 188MB/s
google_compute_instance.vm[1] (remote-exec): libcudn 61% 94.50M 236MB/s
google_compute_instance.vm[1] (remote-exec): libcudnn 99% 151.87M 253MB/s
google_compute_instance.vm[1] (remote-exec): libcudnn7-d 100% 153.07M 253MB/s in 0.6s
google_compute_instance.vm[1] (remote-exec): 2020-11-02 17:32:15 (253 MB/s) - ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’ saved [160506208/160506208]
google_compute_instance.vm[1]: Still creating... [7m0s elapsed]
google_compute_instance.vm[1]: Still creating... [7m10s elapsed]
google_compute_instance.vm[1]: Still creating... [7m20s elapsed]
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 14%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 28%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 42%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 56%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 70%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 84%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 98%
google_compute_instance.vm[1] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[1]: Still creating... [7m30s elapsed]
google_compute_instance.vm[1]: Still creating... [7m40s elapsed]
google_compute_instance.vm[1]: Still creating... [7m50s elapsed]
google_compute_instance.vm[1]: Still creating... [8m0s elapsed]
google_compute_instance.vm[1] (remote-exec): No apport report written because the error message indicates its a followup error from a previous failure.
google_compute_instance.vm[1] (remote-exec): No apport report written because the error message indicates its a followup error from a previous failure.
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1] (remote-exec): No apport report written because MaxReports is reached already
google_compute_instance.vm[1]: Still creating... [8m10s elapsed]
google_compute_instance.vm[1]: Still creating... [8m20s elapsed]
google_compute_instance.vm[1] (remote-exec): E: Sub-process /usr/bin/dpkg returned an error code (1)
google_compute_instance.vm[1] (remote-exec): WARNING: The scripts pip, pip3 and pip3.8 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[1] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[1] (remote-exec): ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
google_compute_instance.vm[1] (remote-exec): WARNING: The scripts easy_install and easy_install-3.8 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[1] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[1] (remote-exec): WARNING: Skipping crcmod as it is not installed.
google_compute_instance.vm[1]: Still creating... [8m30s elapsed]
google_compute_instance.vm[1]: Still creating... [8m40s elapsed]
google_compute_instance.vm[1]: Still creating... [8m50s elapsed]
google_compute_instance.vm[1]: Still creating... [9m0s elapsed]
google_compute_instance.vm[1] (remote-exec): WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', OSError("(104, 'ECONNRESET')"))': /packages/bc/58/0aa6fb779dc69cfc811df3398fcbeaeefbf18561b6e36b185df0782781cc/absl_py-0.11.0-py3-none-any.whl
google_compute_instance.vm[1]: Still creating... [9m10s elapsed]
google_compute_instance.vm[1]: Still creating... [9m20s elapsed]
google_compute_instance.vm[1]: Still creating... [9m30s elapsed]
google_compute_instance.vm[1]: Still creating... [9m40s elapsed]
google_compute_instance.vm[1]: Still creating... [9m50s elapsed]
google_compute_instance.vm[1]: Still creating... [10m0s elapsed]
google_compute_instance.vm[1]: Still creating... [10m10s elapsed]
google_compute_instance.vm[1]: Still creating... [10m20s elapsed]
google_compute_instance.vm[1]: Still creating... [10m30s elapsed]
google_compute_instance.vm[1]: Still creating... [10m40s elapsed]
google_compute_instance.vm[1]: Still creating... [10m50s elapsed]
google_compute_instance.vm[1]: Still creating... [11m0s elapsed]
google_compute_instance.vm[1]: Still creating... [11m10s elapsed]
google_compute_instance.vm[1]: Still creating... [11m20s elapsed]
google_compute_instance.vm[1]: Still creating... [11m30s elapsed]
google_compute_instance.vm[1]: Still creating... [11m40s elapsed]
google_compute_instance.vm[1]: Still creating... [11m50s elapsed]
google_compute_instance.vm[1]: Still creating... [12m0s elapsed]
google_compute_instance.vm[1]: Still creating... [12m10s elapsed]
google_compute_instance.vm[1]: Still creating... [12m20s elapsed]
google_compute_instance.vm[1]: Still creating... [12m30s elapsed]
google_compute_instance.vm[1]: Still creating... [12m40s elapsed]
google_compute_instance.vm[1]: Still creating... [12m50s elapsed]
google_compute_instance.vm[1]: Still creating... [13m0s elapsed]
google_compute_instance.vm[1]: Still creating... [13m10s elapsed]
google_compute_instance.vm[1]: Still creating... [13m20s elapsed]
google_compute_instance.vm[1]: Still creating... [13m30s elapsed]
google_compute_instance.vm[1]: Still creating... [13m40s elapsed]
google_compute_instance.vm[1]: Still creating... [13m50s elapsed]
google_compute_instance.vm[1]: Still creating... [14m0s elapsed]
google_compute_instance.vm[1]: Still creating... [14m10s elapsed]
google_compute_instance.vm[1]: Still creating... [14m20s elapsed]
google_compute_instance.vm[1]: Still creating... [14m30s elapsed]
google_compute_instance.vm[1]: Still creating... [14m40s elapsed]
google_compute_instance.vm[1]: Still creating... [14m50s elapsed]
google_compute_instance.vm[1]: Still creating... [15m0s elapsed]
google_compute_instance.vm[1]: Still creating... [15m10s elapsed]
google_compute_instance.vm[1]: Still creating... [15m20s elapsed]
google_compute_instance.vm[1]: Still creating... [15m30s elapsed]
google_compute_instance.vm[1]: Still creating... [15m40s elapsed]
google_compute_instance.vm[1]: Still creating... [15m50s elapsed]
google_compute_instance.vm[1]: Still creating... [16m0s elapsed]
google_compute_instance.vm[1]: Still creating... [16m10s elapsed]
google_compute_instance.vm[1]: Still creating... [16m20s elapsed]
google_compute_instance.vm[1]: Still creating... [16m30s elapsed]
... (the "Still creating..." output continues indefinitely)
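The failures in the log above are buried between long runs of "Still creating..." lines, so filtering the saved output can help. A minimal sketch, assuming the output has been saved as text: it keeps only `(remote-exec)` lines whose message starts with `E: ` or `ERROR: `. Those prefixes are taken from the failing lines visible in this log and are not an exhaustive list. The function name `provisioning_errors` is hypothetical.

```python
import re

# Matches lines such as:
#   google_compute_instance.vm[1] (remote-exec): E: Sub-process ... returned an error code (1)
REMOTE_EXEC = re.compile(r"^(?P<resource>\S+) \(remote-exec\): (?P<message>.*)$")

# Prefixes seen on failing lines in the log above; not an exhaustive list.
ERROR_PREFIXES = ("E: ", "ERROR: ")


def provisioning_errors(log_text):
    """Return the remote-exec messages that look like errors, in log order."""
    errors = []
    for line in log_text.splitlines():
        match = REMOTE_EXEC.match(line.strip())
        if match is None:
            continue  # progress lines such as "Still creating..." are skipped
        message = match.group("message").strip()
        if message.startswith(ERROR_PREFIXES):
            errors.append(message)
    return errors


if __name__ == "__main__":
    sample = "\n".join([
        "google_compute_instance.vm[1]: Still creating... [6m30s elapsed]",
        "google_compute_instance.vm[1] (remote-exec): E: Sub-process /usr/bin/dpkg returned an error code (1)",
        "google_compute_instance.vm[1] (remote-exec): WARNING: Skipping crcmod as it is not installed.",
    ])
    for message in provisioning_errors(sample):
        print(message)
```

Warnings are deliberately left out. Adding `"WARNING: "` to `ERROR_PREFIXES` would include them.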
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
CUDA version - ok
$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Ubuntu version - ok
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
Python version
$ python3 --version
Python 3.8.2
cuDNN version - ok (7.6.5):
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
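The checks above show the CUDA toolkit and cuDNN headers in place while `nvidia-smi` cannot reach the driver, so "toolkit installed" and "driver usable" are separate questions. The sketch below tells apart a missing `nvidia-smi` binary from one that runs but fails. It assumes `nvidia-smi` exits with a non-zero status when it cannot communicate with the driver, and the function name `driver_status` is hypothetical.

```python
import shutil
import subprocess


def driver_status(command="nvidia-smi"):
    """Classify the GPU driver state as 'missing', 'failed', or 'ok'.

    'missing': the command is not on PATH.
    'failed':  the command ran but exited non-zero, timed out, or could not start.
    'ok':      the command exited with status 0.
    """
    if shutil.which(command) is None:
        return "missing"
    try:
        result = subprocess.run(
            [command],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    print(driver_status())
```

A `failed` result on a machine where `nvcc --version` works would match the situation shown above: the toolkit is present, but the driver is not loaded or not installed.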
boot_disk {
initialize_params {
image = "projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201014"
size = "${var.hard_drive_size_gp}"
type = "pd-ssd"
}
}
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
# Install NVIDIA driver
sudo apt-get -y install --no-install-recommends nvidia-driver-450
# Reboot. Check that GPUs are visible using the command: nvidia-smi
# Install development and runtime libraries (~4GB)
sudo apt-get -y install --no-install-recommends \
cuda-10-1 \
libcudnn7=7.6.5.32-1+cuda10.1 \
libcudnn7-dev=7.6.5.32-1+cuda10.1
google_compute_instance.vm[2]: Creating...
google_compute_instance.vm[2]: Still creating... [10s elapsed]
google_compute_instance.vm[2]: Still creating... [20s elapsed]
google_compute_instance.vm[2]: Still creating... [30s elapsed]
google_compute_instance.vm[2]: Provisioning with 'file'...
google_compute_instance.vm[2]: Still creating... [40s elapsed]
google_compute_instance.vm[2]: Still creating... [50s elapsed]
google_compute_instance.vm[2]: Still creating... [1m0s elapsed]
google_compute_instance.vm[2]: Provisioning with 'remote-exec'...
google_compute_instance.vm[2] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[2] (remote-exec): Host: 35.231.39.6
google_compute_instance.vm[2] (remote-exec): User: cfurtado
google_compute_instance.vm[2] (remote-exec): Password: false
google_compute_instance.vm[2] (remote-exec): Private key: true
google_compute_instance.vm[2] (remote-exec): Certificate: false
google_compute_instance.vm[2] (remote-exec): SSH Agent: false
google_compute_instance.vm[2] (remote-exec): Checking Host Key: false
google_compute_instance.vm[2] (remote-exec): Connected!
google_compute_instance.vm[2] (remote-exec): Running resource creation script... (this may take 10+ minutes)
google_compute_instance.vm[2]: Still creating... [1m10s elapsed]
google_compute_instance.vm[2] (remote-exec): E: Package 'build-essential' has no installation candidate
google_compute_instance.vm[2] (remote-exec): --2020-11-02 18:12:23-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 2936 (2.9K) [application/x-deb]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘cuda-repo-ubuntu1804_10.1.243-1_amd64.deb’
google_compute_instance.vm[2] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): cuda-repo-u 100% 2.87K --.-KB/s in 0s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 18:12:24 (171 MB/s) - ‘cuda-repo-ubuntu1804_10.1.243-1_amd64.deb’ saved [2936/2936]
google_compute_instance.vm[2] (remote-exec): Warning: apt-key output should not be parsed (stdout is not a terminal)
google_compute_instance.vm[2] (remote-exec): gpg: requesting key from 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub'
google_compute_instance.vm[2] (remote-exec): gpg: key F60F4B3D7FA2AF80: public key "cudatools <cudatools@nvidia.com>" imported
google_compute_instance.vm[2] (remote-exec): gpg: Total number processed: 1
google_compute_instance.vm[2] (remote-exec): gpg: imported: 1
google_compute_instance.vm[2]: Still creating... [1m20s elapsed]
google_compute_instance.vm[2] (remote-exec): --2020-11-02 18:12:30-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 2926 (2.9K) [application/x-deb]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb’
google_compute_instance.vm[2] (remote-exec): nvidi 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): nvidia-mach 100% 2.86K --.-KB/s in 0s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 18:12:31 (456 MB/s) - ‘nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb’ saved [2926/2926]
google_compute_instance.vm[2] (remote-exec): WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 33%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 67%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [1m30s elapsed]
google_compute_instance.vm[2]: Still creating... [1m40s elapsed]
google_compute_instance.vm[2]: Still creating... [1m50s elapsed]
google_compute_instance.vm[2]: Still creating... [2m0s elapsed]
google_compute_instance.vm[2]: Still creating... [2m10s elapsed]
google_compute_instance.vm[2]: Still creating... [2m20s elapsed]
google_compute_instance.vm[2]: Still creating... [2m30s elapsed]
google_compute_instance.vm[2]: Still creating... [2m40s elapsed]
google_compute_instance.vm[2]: Still creating... [2m50s elapsed]
google_compute_instance.vm[2]: Still creating... [3m1s elapsed]
google_compute_instance.vm[2]: Still creating... [3m11s elapsed]
google_compute_instance.vm[2]: Still creating... [3m21s elapsed]
google_compute_instance.vm[2]: Still creating... [3m31s elapsed]
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 14%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 28%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 42%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 56%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 71%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 85%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 99%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [3m41s elapsed]
google_compute_instance.vm[2]: Still creating... [3m51s elapsed]
google_compute_instance.vm[2]: Still creating... [4m1s elapsed]
google_compute_instance.vm[2]: Still creating... [4m11s elapsed]
google_compute_instance.vm[2]: Still creating... [4m21s elapsed]
google_compute_instance.vm[2]: Still creating... [4m31s elapsed]
google_compute_instance.vm[2]: Still creating... [4m41s elapsed]
google_compute_instance.vm[2]: Still creating... [4m51s elapsed]
google_compute_instance.vm[2]: Still creating... [5m1s elapsed]
google_compute_instance.vm[2]: Still creating... [5m11s elapsed]
google_compute_instance.vm[2]: Still creating... [5m21s elapsed]
google_compute_instance.vm[2]: Still creating... [5m31s elapsed]
google_compute_instance.vm[2]: Still creating... [5m41s elapsed]
google_compute_instance.vm[2]: Still creating... [5m51s elapsed]
google_compute_instance.vm[2]: Still creating... [6m1s elapsed]
google_compute_instance.vm[2]: Still creating... [6m11s elapsed]
google_compute_instance.vm[2]: Still creating... [6m21s elapsed]
google_compute_instance.vm[2]: Still creating... [6m31s elapsed]
google_compute_instance.vm[2]: Still creating... [6m41s elapsed]
google_compute_instance.vm[2]: Still creating... [6m51s elapsed]
google_compute_instance.vm[2]: Still creating... [7m1s elapsed]
google_compute_instance.vm[2]: Still creating... [7m11s elapsed]
google_compute_instance.vm[2]: Still creating... [7m21s elapsed]
google_compute_instance.vm[2]: Still creating... [7m31s elapsed]
google_compute_instance.vm[2]: Still creating... [7m41s elapsed]
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 13%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 26%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 40%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 53%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 66%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 80%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 93%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [7m51s elapsed]
google_compute_instance.vm[2]: Still creating... [8m1s elapsed]
google_compute_instance.vm[2]: Still creating... [8m11s elapsed]
google_compute_instance.vm[2]: Still creating... [8m21s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts easy_install and easy_install-3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: Skipping crcmod as it is not installed.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2]: Still creating... [8m31s elapsed]
google_compute_instance.vm[2]: Still creating... [8m41s elapsed]
google_compute_instance.vm[2]: Still creating... [8m51s elapsed]
google_compute_instance.vm[2]: Still creating... [9m1s elapsed]
google_compute_instance.vm[2]: Still creating... [9m11s elapsed]
google_compute_instance.vm[2]: Still creating... [9m21s elapsed]
google_compute_instance.vm[2]: Still creating... [9m31s elapsed]
google_compute_instance.vm[2]: Still creating... [9m41s elapsed]
google_compute_instance.vm[2]: Still creating... [9m51s elapsed]
google_compute_instance.vm[2]: Still creating... [10m1s elapsed]
google_compute_instance.vm[2]: Still creating... [10m11s elapsed]
google_compute_instance.vm[2]: Still creating... [10m21s elapsed]
google_compute_instance.vm[2]: Still creating... [10m31s elapsed]
google_compute_instance.vm[2]: Still creating... [10m41s elapsed]
google_compute_instance.vm[2]: Still creating... [10m51s elapsed]
google_compute_instance.vm[2]: Still creating... [11m1s elapsed]
google_compute_instance.vm[2]: Still creating... [11m11s elapsed]
google_compute_instance.vm[2]: Still creating... [11m21s elapsed]
google_compute_instance.vm[2]: Still creating... [11m31s elapsed]
google_compute_instance.vm[2]: Still creating... [11m41s elapsed]
google_compute_instance.vm[2]: Still creating... [11m51s elapsed]
google_compute_instance.vm[2]: Still creating... [12m1s elapsed]
google_compute_instance.vm[2]: Still creating... [12m11s elapsed]
google_compute_instance.vm[2]: Still creating... [12m21s elapsed]
google_compute_instance.vm[2]: Still creating... [12m31s elapsed]
google_compute_instance.vm[2]: Still creating... [12m41s elapsed]
google_compute_instance.vm[2]: Still creating... [12m51s elapsed]
google_compute_instance.vm[2]: Still creating... [13m1s elapsed]
google_compute_instance.vm[2]: Still creating... [13m11s elapsed]
google_compute_instance.vm[2]: Still creating... [13m21s elapsed]
google_compute_instance.vm[2]: Still creating... [13m31s elapsed]
google_compute_instance.vm[2]: Still creating... [13m41s elapsed]
google_compute_instance.vm[2]: Still creating... [13m51s elapsed]
google_compute_instance.vm[2]: Still creating... [14m1s elapsed]
google_compute_instance.vm[2]: Still creating... [14m11s elapsed]
google_compute_instance.vm[2]: Still creating... [14m21s elapsed]
google_compute_instance.vm[2]: Still creating... [14m31s elapsed]
google_compute_instance.vm[2]: Still creating... [14m41s elapsed]
google_compute_instance.vm[2]: Still creating... [14m51s elapsed]
google_compute_instance.vm[2]: Still creating... [15m1s elapsed]
google_compute_instance.vm[2]: Still creating... [15m11s elapsed]
google_compute_instance.vm[2]: Still creating... [15m21s elapsed]
google_compute_instance.vm[2]: Still creating... [15m31s elapsed]
google_compute_instance.vm[2]: Still creating... [15m41s elapsed]
google_compute_instance.vm[2]: Still creating... [15m51s elapsed]
google_compute_instance.vm[2]: Still creating... [16m1s elapsed]
google_compute_instance.vm[2]: Still creating... [16m11s elapsed]
... provisioning does not finish.
It gets stuck while installing the Python packages and never completes.
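The pip warnings in the log above point at two things that could be changed in the provisioning script: calling pip as `python3 -m pip`, and putting `~/.local/bin` on `PATH`. A minimal sketch of that, which has not been run on the VM: the `requirements.txt` file name, the `--user` install, the `PIP_TIMEOUT` variable and the 20 minute limit are all assumptions. The `timeout` wrapper does not fix a hang; it only turns it into a visible failure instead of an endless "Still creating..." loop.

```shell
# Sketch only: requirements.txt, --user and the 20m limit are assumptions,
# not taken from the actual provisioning script.

# Prepend a directory to PATH unless it is already listed there.
add_to_path() {
  case ":${PATH}:" in
    *":$1:"*) ;;                          # already present, nothing to do
    *) PATH="$1${PATH:+:${PATH}}" ;;      # prepend, avoiding a trailing ':'
  esac
  export PATH
}

# Scripts installed by pip land here according to the warnings in the log.
add_to_path "${HOME}/.local/bin"

# Run pip through the interpreter (as the pip warning recommends) with a
# hard time limit, so a stuck install exits non-zero instead of hanging.
install_requirements() {
  timeout "${PIP_TIMEOUT:-20m}" python3 -m pip install --user -r "$1"
}

# Usage (not executed here):
#   install_requirements requirements.txt
```

If the install really does need more than the limit, `PIP_TIMEOUT=45m` (or similar) can be set before calling `install_requirements`.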
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    29W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
cuda version - ok
$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Ubuntu version - ok
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
python version
$ python3 --version
Python 3.6.9
cuDNN version - ok (7.6.5):
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
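The checks above can be collected into a couple of small helpers so the versions are easy to compare between VMs. One thing to keep in mind when reading the output: the "CUDA Version: 11.1" shown by `nvidia-smi` is the highest CUDA version the driver supports, while `nvcc` reports the toolkit that is actually installed (10.1 here), so the two numbers do not have to match. A minimal sketch, where the function names `nvcc_release` and `cudnn_version` are made up for illustration:

```shell
# Sketch only: helper names are hypothetical; both read text on stdin.

# Print the toolkit release (for example 10.1) from `nvcc --version` output.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Print MAJOR.MINOR.PATCHLEVEL from the cuDNN header defines.
cudnn_version() {
  awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { ma = $3 }
       $1 == "#define" && $2 == "CUDNN_MINOR"      { mi = $3 }
       $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { pa = $3 }
       END { if (ma != "") print ma "." mi "." pa }'
}

# Usage on the VM (not executed here):
#   /usr/local/cuda/bin/nvcc --version | nvcc_release
#   cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | cudnn_version
```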
boot_disk {
  initialize_params {
    image = "projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201014"
    size  = "${var.hard_drive_size_gp}"
    type  = "pd-ssd"
  }
}
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
# install cudnn
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
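Note that the script above downloads the CUDA 10.2 local repo package but then exports `/usr/local/cuda-10.1/bin` and installs cuDNN packages built for CUDA 10.1. Whether that mismatch matters on the VM is not confirmed here. A hedged sketch of a guard that only adds a CUDA `bin` directory to `PATH` if it actually exists; the helper name `first_cuda_bin` and the candidate list are assumptions:

```shell
# Sketch only: helper name and candidate directories are assumptions.

# Print the bin directory of the first candidate CUDA root that exists.
# Returns non-zero if none of the candidates has a bin directory.
first_cuda_bin() {
  for dir in "$@"; do
    if [ -d "${dir}/bin" ]; then
      printf '%s\n' "${dir}/bin"
      return 0
    fi
  done
  return 1
}

# Usage on the VM (not executed here):
#   if cuda_bin=$(first_cuda_bin /usr/local/cuda-10.1 /usr/local/cuda-10.2 /usr/local/cuda); then
#     export PATH="${cuda_bin}${PATH:+:${PATH}}"
#   else
#     echo "no CUDA bin directory found" >&2
#   fi
```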
google_compute_instance.vm[2]: Creating...
google_compute_instance.vm[2]: Still creating... [10s elapsed]
google_compute_instance.vm[2]: Still creating... [20s elapsed]
google_compute_instance.vm[2]: Still creating... [30s elapsed]
google_compute_instance.vm[2]: Provisioning with 'file'...
google_compute_instance.vm[2]: Still creating... [40s elapsed]
google_compute_instance.vm[2]: Still creating... [50s elapsed]
google_compute_instance.vm[2]: Still creating... [1m0s elapsed]
google_compute_instance.vm[2]: Still creating... [1m10s elapsed]
google_compute_instance.vm[2]: Still creating... [1m20s elapsed]
google_compute_instance.vm[2]: Provisioning with 'remote-exec'...
google_compute_instance.vm[2] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[2] (remote-exec): Host: 35.231.39.6
google_compute_instance.vm[2] (remote-exec): User: cfurtado
google_compute_instance.vm[2] (remote-exec): Password: false
google_compute_instance.vm[2] (remote-exec): Private key: true
google_compute_instance.vm[2] (remote-exec): Certificate: false
google_compute_instance.vm[2] (remote-exec): SSH Agent: false
google_compute_instance.vm[2] (remote-exec): Checking Host Key: false
google_compute_instance.vm[2] (remote-exec): Connected!
google_compute_instance.vm[2] (remote-exec): Running resource creation script... (this may take 10+ minutes)
google_compute_instance.vm[2]: Still creating... [1m30s elapsed]
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 75%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [1m40s elapsed]
google_compute_instance.vm[2]: Still creating... [1m50s elapsed]
google_compute_instance.vm[2] (remote-exec): --2020-11-02 19:26:36-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 190 [application/octet-stream]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘cuda-ubuntu1804.pin’
google_compute_instance.vm[2] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): cuda-ubuntu 100% 190 --.-KB/s in 0s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 19:26:37 (5.87 MB/s) - ‘cuda-ubuntu1804.pin’ saved [190/190]
google_compute_instance.vm[2] (remote-exec): --2020-11-02 19:26:37-- http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 1896270068 (1.8G) [application/x-deb]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb’
google_compute_instance.vm[2] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): cuda-r 2% 36.79M 184MB/s
google_compute_instance.vm[2] (remote-exec): cuda-re 5% 92.98M 232MB/s
google_compute_instance.vm[2] (remote-exec): cuda-rep 8% 149.28M 249MB/s
google_compute_instance.vm[2] (remote-exec): cuda-repo 11% 204.85M 256MB/s
google_compute_instance.vm[2] (remote-exec): cuda-repo- 14% 261.52M 261MB/s
google_compute_instance.vm[2] (remote-exec): cuda-repo-u 17% 318.06M 265MB/s
google_compute_instance.vm[2] (remote-exec): uda-repo-ub 20% 374.38M 267MB/s
google_compute_instance.vm[2] (remote-exec): da-repo-ubu 23% 431.60M 270MB/s
google_compute_instance.vm[2] (remote-exec): a-repo-ubun 26% 487.65M 271MB/s
google_compute_instance.vm[2] (remote-exec): -repo-ubunt 30% 543.90M 272MB/s
google_compute_instance.vm[2] (remote-exec): repo-ubuntu 33% 600.77M 273MB/s
google_compute_instance.vm[2] (remote-exec): epo-ubuntu1 36% 657.77M 274MB/s
google_compute_instance.vm[2] (remote-exec): po-ubuntu18 39% 715.06M 275MB/s
google_compute_instance.vm[2] (remote-exec): o-ubuntu180 42% 772.61M 276MB/s
google_compute_instance.vm[2] (remote-exec): -ubuntu1804 45% 829.26M 276MB/s eta 4s
google_compute_instance.vm[2] (remote-exec): ubuntu1804- 49% 886.73M 283MB/s eta 4s
google_compute_instance.vm[2] (remote-exec): buntu1804-1 52% 943.84M 283MB/s eta 4s
google_compute_instance.vm[2] (remote-exec): untu1804-10 55% 1001M 284MB/s eta 4s
google_compute_instance.vm[2] (remote-exec): ntu1804-10- 58% 1.03G 283MB/s eta 4s
google_compute_instance.vm[2] (remote-exec): tu1804-10-2 61% 1.09G 284MB/s eta 3s
google_compute_instance.vm[2] (remote-exec): u1804-10-2- 64% 1.14G 284MB/s eta 3s
google_compute_instance.vm[2] (remote-exec): 1804-10-2-l 67% 1.20G 284MB/s eta 3s
google_compute_instance.vm[2] (remote-exec): 804-10-2-lo 70% 1.25G 284MB/s eta 3s
google_compute_instance.vm[2] (remote-exec): 04-10-2-loc 74% 1.31G 284MB/s eta 3s
google_compute_instance.vm[2] (remote-exec): 4-10-2-loca 77% 1.36G 284MB/s eta 1s
google_compute_instance.vm[2] (remote-exec): -10-2-local 80% 1.42G 284MB/s eta 1s
google_compute_instance.vm[2] (remote-exec): 10-2-local- 83% 1.47G 284MB/s eta 1s
google_compute_instance.vm[2] (remote-exec): 0-2-local-1 86% 1.53G 284MB/s eta 1s
google_compute_instance.vm[2] (remote-exec): -2-local-10 89% 1.59G 284MB/s eta 1s
google_compute_instance.vm[2] (remote-exec): 2-local-10. 92% 1.64G 284MB/s eta 0s
google_compute_instance.vm[2] (remote-exec): -local-10.2 96% 1.70G 284MB/s eta 0s
google_compute_instance.vm[2] (remote-exec): local-10.2. 99% 1.75G 284MB/s eta 0s
google_compute_instance.vm[2] (remote-exec): cuda-repo-u 100% 1.77G 284MB/s in 6.4s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 19:26:43 (280 MB/s) - ‘cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb’ saved [1896270068/1896270068]
google_compute_instance.vm[2]: Still creating... [2m0s elapsed]
google_compute_instance.vm[2]: Still creating... [2m10s elapsed]
google_compute_instance.vm[2] (remote-exec): Warning: apt-key output should not be parsed (stdout is not a terminal)
google_compute_instance.vm[2]: Still creating... [2m20s elapsed]
google_compute_instance.vm[2]: Still creating... [2m30s elapsed]
google_compute_instance.vm[2]: Still creating... [2m40s elapsed]
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 11%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 22%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 33%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 44%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 55%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 66%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 77%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 88%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [2m50s elapsed]
google_compute_instance.vm[2]: Still creating... [3m0s elapsed]
google_compute_instance.vm[2]: Still creating... [3m10s elapsed]
google_compute_instance.vm[2]: Still creating... [3m20s elapsed]
google_compute_instance.vm[2]: Still creating... [3m30s elapsed]
google_compute_instance.vm[2]: Still creating... [3m40s elapsed]
google_compute_instance.vm[2]: Still creating... [3m50s elapsed]
google_compute_instance.vm[2]: Still creating... [4m0s elapsed]
google_compute_instance.vm[2]: Still creating... [4m10s elapsed]
google_compute_instance.vm[2]: Still creating... [4m20s elapsed]
google_compute_instance.vm[2]: Still creating... [4m30s elapsed]
google_compute_instance.vm[2]: Still creating... [4m40s elapsed]
google_compute_instance.vm[2]: Still creating... [4m50s elapsed]
google_compute_instance.vm[2]: Still creating... [5m0s elapsed]
google_compute_instance.vm[2]: Still creating... [5m10s elapsed]
google_compute_instance.vm[2]: Still creating... [5m20s elapsed]
google_compute_instance.vm[2]: Still creating... [5m30s elapsed]
google_compute_instance.vm[2]: Still creating... [5m40s elapsed]
google_compute_instance.vm[2]: Still creating... [5m50s elapsed]
google_compute_instance.vm[2]: Still creating... [6m0s elapsed]
google_compute_instance.vm[2]: Still creating... [6m10s elapsed]
google_compute_instance.vm[2]: Still creating... [6m20s elapsed]
google_compute_instance.vm[2]: Still creating... [6m30s elapsed]
google_compute_instance.vm[2]: Still creating... [6m40s elapsed]
google_compute_instance.vm[2]: Still creating... [6m50s elapsed]
google_compute_instance.vm[2]: Still creating... [7m0s elapsed]
google_compute_instance.vm[2] (remote-exec): --2020-11-02 19:31:46-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 182313188 (174M) [application/x-deb]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[2] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): libcud 21% 37.74M 189MB/s
google_compute_instance.vm[2] (remote-exec): libcudn 52% 92.15M 230MB/s
google_compute_instance.vm[2] (remote-exec): libcudnn 85% 149.10M 248MB/s
google_compute_instance.vm[2] (remote-exec): libcudnn7_7 100% 173.87M 252MB/s in 0.7s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 19:31:47 (252 MB/s) - ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’ saved [182313188/182313188]
google_compute_instance.vm[2]: Still creating... [7m10s elapsed]
google_compute_instance.vm[2] (remote-exec): --2020-11-02 19:32:05-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[2] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[2] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[2]: Still creating... [7m20s elapsed]
google_compute_instance.vm[2] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[2] (remote-exec): Length: 160506208 (153M) [application/x-deb]
google_compute_instance.vm[2] (remote-exec): Saving to: ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[2] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[2] (remote-exec): libcud 25% 38.42M 192MB/s
google_compute_instance.vm[2] (remote-exec): libcudn 62% 96.05M 240MB/s
google_compute_instance.vm[2] (remote-exec): libcudnn7-d 100% 153.07M 256MB/s in 0.6s
google_compute_instance.vm[2] (remote-exec): 2020-11-02 19:32:06 (256 MB/s) - ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’ saved [160506208/160506208]
google_compute_instance.vm[2]: Still creating... [7m30s elapsed]
google_compute_instance.vm[2]: Still creating... [7m40s elapsed]
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 13%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 26%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 40%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 53%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 67%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 80%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 94%
google_compute_instance.vm[2] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[2]: Still creating... [7m50s elapsed]
google_compute_instance.vm[2]: Still creating... [8m0s elapsed]
google_compute_instance.vm[2]: Still creating... [8m10s elapsed]
google_compute_instance.vm[2]: Still creating... [8m20s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts easy_install and easy_install-3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2]: Still creating... [8m30s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: Skipping crcmod as it is not installed.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[2] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[2] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[2]: Still creating... [8m40s elapsed]
google_compute_instance.vm[2]: Still creating... [8m50s elapsed]
google_compute_instance.vm[2]: Still creating... [9m0s elapsed]
google_compute_instance.vm[2]: Still creating... [9m10s elapsed]
google_compute_instance.vm[2]: Still creating... [9m20s elapsed]
google_compute_instance.vm[2]: Still creating... [9m30s elapsed]
google_compute_instance.vm[2]: Still creating... [9m40s elapsed]
google_compute_instance.vm[2]: Still creating... [9m50s elapsed]
google_compute_instance.vm[2]: Still creating... [10m0s elapsed]
google_compute_instance.vm[2]: Still creating... [10m10s elapsed]
google_compute_instance.vm[2]: Still creating... [10m20s elapsed]
google_compute_instance.vm[2]: Still creating... [10m30s elapsed]
google_compute_instance.vm[2]: Still creating... [10m40s elapsed]
google_compute_instance.vm[2]: Still creating... [10m50s elapsed]
google_compute_instance.vm[2]: Still creating... [11m0s elapsed]
google_compute_instance.vm[2]: Still creating... [11m10s elapsed]
google_compute_instance.vm[2]: Still creating... [11m20s elapsed]
google_compute_instance.vm[2]: Still creating... [11m30s elapsed]
google_compute_instance.vm[2]: Still creating... [11m40s elapsed]
google_compute_instance.vm[2]: Still creating... [11m50s elapsed]
google_compute_instance.vm[2]: Still creating... [12m0s elapsed]
google_compute_instance.vm[2]: Still creating... [12m10s elapsed]
google_compute_instance.vm[2]: Still creating... [12m20s elapsed]
google_compute_instance.vm[2]: Still creating... [12m30s elapsed]
google_compute_instance.vm[2]: Still creating... [12m40s elapsed]
google_compute_instance.vm[2]: Still creating... [12m50s elapsed]
google_compute_instance.vm[2]: Still creating... [13m0s elapsed]
google_compute_instance.vm[2]: Still creating... [13m10s elapsed]
google_compute_instance.vm[2]: Still creating... [13m20s elapsed]
google_compute_instance.vm[2]: Still creating... [13m30s elapsed]
google_compute_instance.vm[2]: Still creating... [13m40s elapsed]
google_compute_instance.vm[2]: Still creating... [13m50s elapsed]
google_compute_instance.vm[2]: Still creating... [14m0s elapsed]
google_compute_instance.vm[2]: Still creating... [14m10s elapsed]
google_compute_instance.vm[2]: Still creating... [14m20s elapsed]
google_compute_instance.vm[2]: Still creating... [14m30s elapsed]
google_compute_instance.vm[2]: Still creating... [14m40s elapsed]
google_compute_instance.vm[2]: Still creating... [14m50s elapsed]
google_compute_instance.vm[2]: Still creating... [15m0s elapsed]
google_compute_instance.vm[2]: Still creating... [15m10s elapsed]
google_compute_instance.vm[2]: Still creating... [15m20s elapsed]
google_compute_instance.vm[2]: Still creating... [15m30s elapsed]
google_compute_instance.vm[2]: Still creating... [15m40s elapsed]
google_compute_instance.vm[2]: Still creating... [15m50s elapsed]
google_compute_instance.vm[2]: Still creating... [16m0s elapsed]
google_compute_instance.vm[2]: Still creating... [16m10s elapsed]
google_compute_instance.vm[2]: Still creating... [16m20s elapsed]
google_compute_instance.vm[2]: Still creating... [16m30s elapsed]
google_compute_instance.vm[2]: Still creating... [16m40s elapsed]
google_compute_instance.vm[2]: Still creating... [16m50s elapsed]
google_compute_instance.vm[2]: Still creating... [17m0s elapsed]
google_compute_instance.vm[2]: Still creating... [17m10s elapsed]
google_compute_instance.vm[2]: Still creating... [17m20s elapsed]
google_compute_instance.vm[2]: Still creating... [17m30s elapsed]
google_compute_instance.vm[2]: Still creating... [17m40s elapsed]
google_compute_instance.vm[2]: Still creating... [17m50s elapsed]
google_compute_instance.vm[2]: Still creating... [18m0s elapsed]
google_compute_instance.vm[2]: Still creating... [18m10s elapsed]
google_compute_instance.vm[2]: Still creating... [18m20s elapsed]
google_compute_instance.vm[2]: Still creating... [18m30s elapsed]
google_compute_instance.vm[2]: Still creating... [18m40s elapsed]
google_compute_instance.vm[2]: Still creating... [18m50s elapsed]
google_compute_instance.vm[2]: Still creating... [19m0s elapsed]
google_compute_instance.vm[2]: Still creating... [19m10s elapsed]
google_compute_instance.vm[2]: Still creating... [19m20s elapsed]
google_compute_instance.vm[2]: Still creating... [19m30s elapsed]
google_compute_instance.vm[2]: Still creating... [19m40s elapsed]
google_compute_instance.vm[2]: Still creating... [19m50s elapsed]
google_compute_instance.vm[2]: Still creating... [20m0s elapsed]
google_compute_instance.vm[2]: Still creating... [20m10s elapsed]
google_compute_instance.vm[2]: Still creating... [20m20s elapsed]
google_compute_instance.vm[2]: Still creating... [20m30s elapsed]
google_compute_instance.vm[2]: Still creating... [20m40s elapsed]
google_compute_instance.vm[2]: Still creating... [20m50s elapsed]
google_compute_instance.vm[2]: Still creating... [21m0s elapsed]
google_compute_instance.vm[2]: Still creating... [21m10s elapsed]
google_compute_instance.vm[2]: Still creating... [21m20s elapsed]
google_compute_instance.vm[2]: Still creating... [21m30s elapsed]
google_compute_instance.vm[2]: Still creating... [21m40s elapsed]
google_compute_instance.vm[2]: Still creating... [21m50s elapsed]
google_compute_instance.vm[2]: Still creating... [22m0s elapsed]
google_compute_instance.vm[2]: Still creating... [22m10s elapsed]
google_compute_instance.vm[2]: Still creating... [22m20s elapsed]
google_compute_instance.vm[2]: Still creating... [22m30s elapsed]
google_compute_instance.vm[2]: Still creating... [22m40s elapsed]
google_compute_instance.vm[2]: Still creating... [22m50s elapsed]
google_compute_instance.vm[2]: Still creating... [23m0s elapsed]
google_compute_instance.vm[2]: Still creating... [23m10s elapsed]
google_compute_instance.vm[2]: Still creating... [23m20s elapsed]
google_compute_instance.vm[2]: Still creating... [23m30s elapsed]
google_compute_instance.vm[2]: Still creating... [23m40s elapsed]
google_compute_instance.vm[2]: Still creating... [23m50s elapsed]
google_compute_instance.vm[2]: Still creating... [24m0s elapsed]
google_compute_instance.vm[2]: Still creating... [24m10s elapsed]
google_compute_instance.vm[2]: Still creating... [24m20s elapsed]
google_compute_instance.vm[2]: Still creating... [24m30s elapsed]
google_compute_instance.vm[2]: Still creating... [24m40s elapsed]
google_compute_instance.vm[2]: Still creating... [24m50s elapsed]
google_compute_instance.vm[2]: Still creating... [25m0s elapsed]
google_compute_instance.vm[2]: Still creating... [25m10s elapsed]
google_compute_instance.vm[2]: Still creating... [25m20s elapsed]
google_compute_instance.vm[2]: Still creating... [25m30s elapsed]
google_compute_instance.vm[2]: Still creating... [25m40s elapsed]
google_compute_instance.vm[2]: Still creating... [25m50s elapsed]
google_compute_instance.vm[2]: Still creating... [26m0s elapsed]
google_compute_instance.vm[2]: Still creating... [26m10s elapsed]
google_compute_instance.vm[2]: Still creating... [32m54s elapsed]
google_compute_instance.vm[2]: Still creating... [33m4s elapsed]
google_compute_instance.vm[2]: Still creating... [33m14s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts f2py, f2py3 and f2py3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script markdown_py is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts pyrsa-decrypt, pyrsa-encrypt, pyrsa-keygen, pyrsa-priv2pub, pyrsa-sign and pyrsa-verify are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script google-oauthlib-tool is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script tensorboard is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2]: Still creating... [33m24s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts estimator_ckpt_converter, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts imageio_download_bin and imageio_remove_bin are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts lsm2bin and tifffile are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2]: Still creating... [33m34s elapsed]
google_compute_instance.vm[2] (remote-exec): WARNING: The script skivi is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script pygmentize is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts iptest, iptest3, ipython and ipython3 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts jupyter, jupyter-migrate and jupyter-troubleshoot are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script jupyter-trust is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts jupyter-kernel, jupyter-kernelspec and jupyter-run are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script jupyter-nbconvert is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The script jupyter-console is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2] (remote-exec): WARNING: The scripts jupyter-bundlerextension, jupyter-nbextension, jupyter-notebook and jupyter-serverextension are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[2] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[2]: Still creating... [33m44s elapsed]
google_compute_instance.vm[2] (remote-exec): ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
google_compute_instance.vm[2] (remote-exec): We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
google_compute_instance.vm[2] (remote-exec): tensorflow-gpu 2.3.1 requires numpy<1.19.0,>=1.16.0, but you'll have numpy 1.19.4 which is incompatible.
google_compute_instance.vm[2]: Provisioning with 'remote-exec'...
google_compute_instance.vm[2] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[2] (remote-exec): Host: 35.231.39.6
google_compute_instance.vm[2] (remote-exec): User: cfurtado
google_compute_instance.vm[2] (remote-exec): Password: false
google_compute_instance.vm[2] (remote-exec): Private key: true
google_compute_instance.vm[2] (remote-exec): Certificate: false
google_compute_instance.vm[2] (remote-exec): SSH Agent: false
google_compute_instance.vm[2] (remote-exec): Checking Host Key: false
google_compute_instance.vm[2] (remote-exec): Connected!
google_compute_instance.vm[2]: Creation complete after 33m47s [id=projects/necstlab/zones/us-east1-c/instances/cfurtado-necstlab-2]
Warning: Applied changes may be incomplete
The plan was created with the -target option in effect, so some changes
requested in the configuration may have been ignored and the output values may
not be fully updated. Run the following command to verify that no other
changes are pending:
terraform plan
Note that the -target option is not suitable for routine use, and is provided
only for exceptional situations such as recovering from errors or mistakes, or
when Terraform specifically suggests to use it as part of an error message.
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
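The pip message in the log above says that tensorflow-gpu 2.3.1 requires numpy>=1.16.0,<1.19.0, while numpy 1.19.4 is what ended up installed. As a rough sketch, the check below tests a numpy version against those two bounds. The bounds are copied from the log; the helper names are hypothetical and not part of this repository.

```python
def parse_version(text):
    """Turn a dotted version such as '1.19.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def numpy_ok_for_tf231(numpy_version):
    """Check numpy against the bounds quoted in the pip message:
    numpy>=1.16.0,<1.19.0 for tensorflow-gpu 2.3.1."""
    lower, upper = (1, 16, 0), (1, 19, 0)
    return lower <= parse_version(numpy_version) < upper


if __name__ == "__main__":
    # 1.19.4 is the version reported in the log; 1.18.5 is just an in-range example.
    for candidate in ("1.19.4", "1.18.5"):
        print(candidate, numpy_ok_for_tf231(candidate))
```

If the check fails, one option might be to pin numpy in the provisioning script, e.g. `python3 -m pip install 'numpy>=1.16.0,<1.19.0'`. Whether that pin plays well with the other packages installed here has not been tested.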
dlerror: libcudart.so.10.1
2020-11-02 20:06:09.697596: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-02 20:06:09.697641: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
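A quick way to reproduce the warning above outside of TensorFlow may be to ask the dynamic loader for the same library directly. This is only a sketch: the library name comes from the warning, and `can_load` is a hypothetical helper.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False


if __name__ == "__main__":
    # Library name taken from the TensorFlow warning above.
    name = "libcudart.so.10.1"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```

If this reports the library as not loadable, the CUDA 10.1 runtime is probably either not installed or not on the loader's search path (for example via `LD_LIBRARY_PATH`).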
$nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P0    27W / 250W |     10MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
cuda version - ok
$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
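Note that nvcc reports release 10.2 here, while the TensorFlow warning is looking for `libcudart.so.10.1`, so the "ok" above may deserve a second look. A minimal sketch of comparing the two follows. Both helper names are hypothetical, and the input strings are copied from the output in this comment.

```python
import re


def nvcc_release(nvcc_output):
    """Extract the 'release X.Y' value from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def expected_cuda_from_soname(soname):
    """Pull the version suffix out of a name such as 'libcudart.so.10.1'."""
    match = re.search(r"\.so\.(\d+\.\d+)$", soname)
    return match.group(1) if match else None


if __name__ == "__main__":
    output = "Cuda compilation tools, release 10.2, V10.2.89"
    installed = nvcc_release(output)
    wanted = expected_cuda_from_soname("libcudart.so.10.1")
    print("installed:", installed, "wanted:", wanted, "match:", installed == wanted)
```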
ubuntu version - ok
$lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
python version
$ python3 --version
Python 3.6.9
cudnn version - ok 7.6.5:
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
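For reference, the `CUDNN_VERSION` macro in the header output above combines the three defines into a single integer. A small sketch of the same arithmetic, with a hypothetical function name that mirrors the macro:

```python
def cudnn_version(major, minor, patchlevel):
    """Same formula as the CUDNN_VERSION macro in the header above."""
    return major * 1000 + minor * 100 + patchlevel


if __name__ == "__main__":
    # Values from the grep output: MAJOR 7, MINOR 6, PATCHLEVEL 5.
    print(cudnn_version(7, 6, 5))  # 7605
```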
boot_disk {
initialize_params {
image = "projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201014"
size = "${var.hard_drive_size_gp}"
type = "pd-ssd"
}
}
sudo apt-get update
sudo apt-get install -y build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
# install cudnn
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo reboot
google_compute_instance.vm[0]: Creating...
google_compute_instance.vm[0]: Still creating... [10s elapsed]
google_compute_instance.vm[0]: Still creating... [20s elapsed]
google_compute_instance.vm[0]: Still creating... [30s elapsed]
google_compute_instance.vm[0]: Provisioning with 'file'...
google_compute_instance.vm[0]: Still creating... [40s elapsed]
google_compute_instance.vm[0]: Still creating... [50s elapsed]
google_compute_instance.vm[0]: Still creating... [1m0s elapsed]
google_compute_instance.vm[0]: Still creating... [1m10s elapsed]
google_compute_instance.vm[0]: Still creating... [1m20s elapsed]
google_compute_instance.vm[0]: Provisioning with 'remote-exec'...
google_compute_instance.vm[0] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[0] (remote-exec): Host: 34.74.210.177
google_compute_instance.vm[0] (remote-exec): User: cfurtado
google_compute_instance.vm[0] (remote-exec): Password: false
google_compute_instance.vm[0] (remote-exec): Private key: true
google_compute_instance.vm[0] (remote-exec): Certificate: false
google_compute_instance.vm[0] (remote-exec): SSH Agent: false
google_compute_instance.vm[0] (remote-exec): Checking Host Key: false
google_compute_instance.vm[0] (remote-exec): Connected!
google_compute_instance.vm[0] (remote-exec): Running resource creation script... (this may take 10+ minutes)
google_compute_instance.vm[0]: Still creating... [1m30s elapsed]
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 73%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [1m40s elapsed]
google_compute_instance.vm[0]: Still creating... [1m50s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-03 20:04:10-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:443... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 190 [application/octet-stream]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘cuda-ubuntu1804.pin’
google_compute_instance.vm[0] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): cuda-ubuntu 100% 190 --.-KB/s in 0s
google_compute_instance.vm[0] (remote-exec): 2020-11-03 20:04:11 (3.48 MB/s) - ‘cuda-ubuntu1804.pin’ saved [190/190]
google_compute_instance.vm[0] (remote-exec): --2020-11-03 20:04:11-- http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 1859785444 (1.7G) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): cuda- 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): cuda-r 2% 35.78M 179MB/s
google_compute_instance.vm[0] (remote-exec): cuda-re 5% 91.98M 230MB/s
google_compute_instance.vm[0] (remote-exec): cuda-rep 8% 148.50M 247MB/s
google_compute_instance.vm[0] (remote-exec): cuda-repo 11% 205.27M 257MB/s
google_compute_instance.vm[0] (remote-exec): cuda-repo- 14% 261.36M 261MB/s
google_compute_instance.vm[0] (remote-exec): cuda-repo-u 17% 318.37M 265MB/s
google_compute_instance.vm[0] (remote-exec): uda-repo-ub 21% 374.00M 267MB/s
google_compute_instance.vm[0] (remote-exec): da-repo-ubu 24% 429.86M 269MB/s
google_compute_instance.vm[0] (remote-exec): a-repo-ubun 27% 486.12M 270MB/s
google_compute_instance.vm[0] (remote-exec): -repo-ubunt 30% 543.05M 271MB/s
google_compute_instance.vm[0] (remote-exec): repo-ubuntu 33% 599.42M 272MB/s
google_compute_instance.vm[0] (remote-exec): epo-ubuntu1 36% 656.23M 273MB/s
google_compute_instance.vm[0] (remote-exec): po-ubuntu18 40% 712.70M 274MB/s
google_compute_instance.vm[0] (remote-exec): o-ubuntu180 43% 769.66M 275MB/s
google_compute_instance.vm[0] (remote-exec): -ubuntu1804 46% 826.18M 275MB/s eta 3s
google_compute_instance.vm[0] (remote-exec): ubuntu1804- 49% 882.84M 282MB/s eta 3s
google_compute_instance.vm[0] (remote-exec): buntu1804-1 52% 939.10M 282MB/s eta 3s
google_compute_instance.vm[0] (remote-exec): untu1804-10 56% 995.23M 282MB/s eta 3s
google_compute_instance.vm[0] (remote-exec): ntu1804-10- 59% 1.02G 281MB/s eta 3s
google_compute_instance.vm[0] (remote-exec): tu1804-10-1 62% 1.08G 280MB/s eta 2s
google_compute_instance.vm[0] (remote-exec): u1804-10-1- 65% 1.13G 280MB/s eta 2s
google_compute_instance.vm[0] (remote-exec): 1804-10-1-l 68% 1.19G 280MB/s eta 2s
google_compute_instance.vm[0] (remote-exec): 804-10-1-lo 71% 1.24G 280MB/s eta 2s
google_compute_instance.vm[0] (remote-exec): 04-10-1-loc 74% 1.30G 281MB/s eta 2s
google_compute_instance.vm[0] (remote-exec): 4-10-1-loca 77% 1.35G 280MB/s eta 1s
google_compute_instance.vm[0] (remote-exec): -10-1-local 80% 1.40G 278MB/s eta 1s
google_compute_instance.vm[0] (remote-exec): 10-1-local- 83% 1.45G 278MB/s eta 1s
google_compute_instance.vm[0] (remote-exec): 0-1-local-1 86% 1.51G 277MB/s eta 1s
google_compute_instance.vm[0] (remote-exec): -1-local-10 90% 1.56G 277MB/s eta 1s
google_compute_instance.vm[0] (remote-exec): 1-local-10. 93% 1.62G 277MB/s eta 0s
google_compute_instance.vm[0]: Still creating... [2m0s elapsed]
google_compute_instance.vm[0] (remote-exec): -local-10.1 96% 1.67G 276MB/s eta 0s
google_compute_instance.vm[0] (remote-exec): local-10.1. 99% 1.72G 276MB/s eta 0s
google_compute_instance.vm[0] (remote-exec): cuda-repo-u 100% 1.73G 276MB/s in 6.4s
google_compute_instance.vm[0] (remote-exec): 2020-11-03 20:04:17 (276 MB/s) - ‘cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb’ saved [1859785444/1859785444]
google_compute_instance.vm[0]: Still creating... [2m10s elapsed]
google_compute_instance.vm[0] (remote-exec): Warning: apt-key output should not be parsed (stdout is not a terminal)
google_compute_instance.vm[0]: Still creating... [2m20s elapsed]
google_compute_instance.vm[0]: Still creating... [2m30s elapsed]
google_compute_instance.vm[0]: Still creating... [2m40s elapsed]
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 11%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 22%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 33%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 45%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 56%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 67%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 79%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 90%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [2m50s elapsed]
google_compute_instance.vm[0]: Still creating... [3m0s elapsed]
google_compute_instance.vm[0]: Still creating... [3m10s elapsed]
google_compute_instance.vm[0]: Still creating... [3m20s elapsed]
google_compute_instance.vm[0]: Still creating... [3m30s elapsed]
google_compute_instance.vm[0]: Still creating... [3m40s elapsed]
google_compute_instance.vm[0]: Still creating... [3m50s elapsed]
google_compute_instance.vm[0]: Still creating... [4m0s elapsed]
google_compute_instance.vm[0]: Still creating... [4m10s elapsed]
google_compute_instance.vm[0]: Still creating... [4m20s elapsed]
google_compute_instance.vm[0]: Still creating... [4m30s elapsed]
google_compute_instance.vm[0]: Still creating... [4m40s elapsed]
google_compute_instance.vm[0]: Still creating... [4m50s elapsed]
google_compute_instance.vm[0]: Still creating... [5m0s elapsed]
google_compute_instance.vm[0]: Still creating... [5m10s elapsed]
google_compute_instance.vm[0]: Still creating... [5m20s elapsed]
google_compute_instance.vm[0]: Still creating... [5m30s elapsed]
google_compute_instance.vm[0]: Still creating... [5m40s elapsed]
google_compute_instance.vm[0]: Still creating... [5m50s elapsed]
google_compute_instance.vm[0]: Still creating... [6m0s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-03 20:08:26-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 182313188 (174M) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): libcud 21% 37.97M 190MB/s
google_compute_instance.vm[0] (remote-exec): libcudn 54% 94.67M 237MB/s
google_compute_instance.vm[0]: Still creating... [6m10s elapsed]
google_compute_instance.vm[0] (remote-exec): libcudnn 86% 150.36M 250MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn7_7 100% 173.87M 255MB/s in 0.7s
google_compute_instance.vm[0] (remote-exec): 2020-11-03 20:08:27 (255 MB/s) - ‘libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb’ saved [182313188/182313188]
google_compute_instance.vm[0]: Still creating... [6m20s elapsed]
google_compute_instance.vm[0] (remote-exec): --2020-11-03 20:08:45-- http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
google_compute_instance.vm[0] (remote-exec): Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.195.19.142
google_compute_instance.vm[0] (remote-exec): Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.195.19.142|:80... connected.
google_compute_instance.vm[0] (remote-exec): HTTP request sent, awaiting response... 200 OK
google_compute_instance.vm[0] (remote-exec): Length: 160506208 (153M) [application/x-deb]
google_compute_instance.vm[0] (remote-exec): Saving to: ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’
google_compute_instance.vm[0] (remote-exec): libcu 0% 0 --.-KB/s
google_compute_instance.vm[0] (remote-exec): libcud 23% 35.27M 176MB/s
google_compute_instance.vm[0] (remote-exec): libcudn 59% 91.42M 229MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn 96% 147.07M 245MB/s
google_compute_instance.vm[0] (remote-exec): libcudnn7-d 100% 153.07M 246MB/s in 0.6s
google_compute_instance.vm[0] (remote-exec): 2020-11-03 20:08:45 (246 MB/s) - ‘libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb’ saved [160506208/160506208]
google_compute_instance.vm[0]: Still creating... [6m30s elapsed]
google_compute_instance.vm[0]: Still creating... [6m40s elapsed]
google_compute_instance.vm[0]: Still creating... [6m50s elapsed]
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 13%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 26%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 40%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 53%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 67%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 80%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 94%
google_compute_instance.vm[0] (remote-exec): Extracting templates from packages: 100%
google_compute_instance.vm[0]: Still creating... [7m0s elapsed]
google_compute_instance.vm[0]: Still creating... [7m10s elapsed]
google_compute_instance.vm[0]: Still creating... [7m20s elapsed]
google_compute_instance.vm[0]: Still creating... [7m30s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts easy_install and easy_install-3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0] (remote-exec): WARNING: Skipping crcmod as it is not installed.
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0]: Still creating... [7m40s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
google_compute_instance.vm[0] (remote-exec): Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
google_compute_instance.vm[0] (remote-exec): To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
google_compute_instance.vm[0]: Still creating... [7m50s elapsed]
google_compute_instance.vm[0]: Still creating... [8m0s elapsed]
google_compute_instance.vm[0]: Still creating... [8m10s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts f2py, f2py3 and f2py3.6 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script markdown_py is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts pyrsa-decrypt, pyrsa-encrypt, pyrsa-keygen, pyrsa-priv2pub, pyrsa-sign and pyrsa-verify are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script google-oauthlib-tool is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script tensorboard is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0]: Still creating... [8m20s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts estimator_ckpt_converter, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts lsm2bin and tifffile are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0]: Still creating... [8m30s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts imageio_download_bin and imageio_remove_bin are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script skivi is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script pygmentize is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts iptest, iptest3, ipython and ipython3 are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter, jupyter-migrate and jupyter-troubleshoot are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter-kernel, jupyter-kernelspec and jupyter-run are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-trust is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0]: Still creating... [8m40s elapsed]
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-nbconvert is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The scripts jupyter-bundlerextension, jupyter-nbextension, jupyter-notebook and jupyter-serverextension are installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): WARNING: The script jupyter-console is installed in '/home/cfurtado/.local/bin' which is not on PATH.
google_compute_instance.vm[0] (remote-exec): Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
google_compute_instance.vm[0] (remote-exec): ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
google_compute_instance.vm[0] (remote-exec): We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
google_compute_instance.vm[0] (remote-exec): tensorflow-gpu 2.3.1 requires numpy<1.19.0,>=1.16.0, but you'll have numpy 1.19.4 which is incompatible.
google_compute_instance.vm[0]: Provisioning with 'remote-exec'...
google_compute_instance.vm[0] (remote-exec): Connecting to remote host via SSH...
google_compute_instance.vm[0] (remote-exec): Host: 34.74.210.177
google_compute_instance.vm[0] (remote-exec): User: cfurtado
google_compute_instance.vm[0] (remote-exec): Password: false
google_compute_instance.vm[0] (remote-exec): Private key: true
google_compute_instance.vm[0] (remote-exec): Certificate: false
google_compute_instance.vm[0] (remote-exec): SSH Agent: false
google_compute_instance.vm[0] (remote-exec): Checking Host Key: false
google_compute_instance.vm[0] (remote-exec): Connected!
google_compute_instance.vm[0]: Creation complete after 8m43s [id=projects/necstlab/zones/us-east1-c/instances/cfurtado-necstlab-0]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
dlerror: libcudart.so.10.1
2020-11-02 20:06:09.697596: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-02 20:06:09.697641: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
$nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
cuda version - ok
$ /usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
ubuntu version - ok
$lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
python version
$ python3 --version
Python 3.6.9
cudnn version - ok 7.6.5:
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
ubuntu (and python) 20.x, pip >=19, tensorflow 2.3