Azure / batch-shipyard

Simplify HPC and Batch workloads on Azure
MIT License

Cryptography installation error in resource monitoring VM #362

Open jscarlson opened 3 years ago

Problem Description

I'm attempting to set up real-time monitoring of my Azure Batch pool as described in the Batch Shipyard docs. After following the in-depth guide to resource monitoring and updating my configuration YAML files as that guide describes, I ran shipyard monitor create and hit the error detailed below.

Batch Shipyard Version

3.9.1 (most recent)

Steps to Reproduce

Simply running shipyard monitor create with my configuration files.

Expected Results

Message indicating successful provisioning process.

Actual Results

I get the following error message, which I've included with some redactions.

[REDACTED] ERROR - Ran out of retry attempts invoking _create_virtual_machine_extension([REDACTED]) status_code=200
Traceback (most recent call last):
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrestazure/polling/arm_polling.py", line 390, in run
    self._poll()
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrestazure/polling/arm_polling.py", line 418, in _poll
    raise OperationFailed("Operation failed or cancelled")
msrestazure.polling.arm_polling.OperationFailed: Operation failed or cancelled

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/[REDACTED]/batch-shipyard/shipyard.py", line 3136, in <module>
    cli()
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/[REDACTED]/batch-shipyard/shipyard.py", line 2432, in monitor_create
    convoy.fleet.action_monitor_create(
  File "/home/[REDACTED]/batch-shipyard/convoy/fleet.py", line 4614, in action_monitor_create
    monitor.create_monitoring_resource(
  File "/home/[REDACTED]/batch-shipyard/convoy/monitor.py", line 441, in create_monitoring_resource
    vm_ext = async_ops['vmext'][offset].result()
  File "/home/[REDACTED]/batch-shipyard/convoy/resource.py", line 104, in result
    return self._op.result()
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrest/polling/poller.py", line 183, in result
    self.wait(timeout)
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrest/polling/poller.py", line 201, in wait
    raise self._exception  # type: ignore
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrest/polling/poller.py", line 152, in _start
    self._polling_method.run()
  File "/home/[REDACTED]/.local/lib/python3.8/site-packages/msrestazure/polling/arm_polling.py", line 400, in run
    raise CloudError(self._response)
msrestazure.azure_exceptions.CloudError: Azure Error: VMExtensionProvisioningError
Message: VM has reported a failure when processing extension '[REDACTED]-vmext000'. Error message: "Enable failed: failed to execute command: command terminated with exit status=1
[stdout]
et-client<1,>=0.32.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/08/33/80e0d4f60e84a1ddd9a03f340be1065a2a363c47ce65c4bd3bae65ce9631/websocket_client-0.58.0-py2.py3-none-any.whl (61kB)
Collecting docker[ssh]<5,>=4.4.4 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/c4/22/410313ad554477e87ec406d38d85f810e61ddb0d2fc44e64994857476de9/docker-4.4.4-py2.py3-none-any.whl (147kB)
Requirement already satisfied: PyYAML<6,>=3.10 in /usr/lib/python3/dist-packages (from docker-compose)
Collecting cached-property<2,>=1.2.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/48/19/f2090f7dad41e225c7f2326e4cfe6fff49e57dedb5b53636c9551f86b069/cached_property-1.5.2-py2.py3-none-any.whl
Collecting distro<2,>=1.5.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/25/b7/b3c4270a11414cb22c6352ebc7a83aaa3712043be29daa05018fd5a5c956/distro-1.5.0-py2.py3-none-any.whl
Requirement already satisfied: jsonschema<4,>=2.5.1 in /usr/lib/python3/dist-packages (from docker-compose)
Collecting texttable<2,>=0.9.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/06/f5/46201c428aebe0eecfa83df66bf3e6caa29659dbac5a56ddfd83cae0d4a4/texttable-1.6.3-py2.py3-none-any.whl
Collecting requests<3,>=2.20.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/29/c1/24814557f1d22c56d50280771a17307e6bf87b70727d975fd6b2ce6b014a/requests-2.25.1-py2.py3-none-any.whl (61kB)
Collecting dockerpty<1,>=0.4.1 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/8d/ee/e9ecce4c32204a6738e0a5d5883d3413794d7498fe8b06f44becc028d3ba/dockerpty-0.4.1.tar.gz
Collecting docopt<1,>=0.6.1 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/a2/55/8f8cab2afd404cf578136ef2cc5dfb50baa1761b68c9da1fb1e4eed343c9/docopt-0.6.2.tar.gz
Collecting python-dotenv<1,>=0.13.0 (from docker-compose)
  Downloading https://files.pythonhosted.org/packages/32/2e/e4585559237787966aad0f8fd0fc31df1c4c9eb0e62de458c5b6cde954eb/python_dotenv-0.15.0-py2.py3-none-any.whl
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from websocket-client<1,>=0.32.0->docker-compose)
Collecting paramiko>=2.4.2; extra == "ssh" (from docker[ssh]<5,>=4.4.4->docker-compose)
  Downloading https://files.pythonhosted.org/packages/95/19/124e9287b43e6ff3ebb9cdea3e5e8e88475a873c05ccdf8b7e20d2c4201e/paramiko-2.7.2-py2.py3-none-any.whl (206kB)
Requirement already satisfied: idna<3,>=2.5 in /usr/lib/python3/dist-packages (from requests<3,>=2.20.0->docker-compose)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests<3,>=2.20.0->docker-compose)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/lib/python3/dist-packages (from requests<3,>=2.20.0->docker-compose)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests<3,>=2.20.0->docker-compose)
Collecting cryptography>=2.5 (from paramiko>=2.4.2; extra == "ssh"->docker[ssh]<5,>=4.4.4->docker-compose)
  Downloading https://files.pythonhosted.org/packages/fa/2d/2154d8cb773064570f48ec0b60258a4522490fcb115a6c7c9423482ca993/cryptography-3.4.6.tar.gz (546kB)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-17v_0468/cryptography/setup.py", line 14, in <module>
        from setuptools_rust import RustExtension
    ModuleNotFoundError: No module named 'setuptools_rust'

            =============================DEBUG ASSISTANCE==========================
            If you are seeing an error here please try the following to
            successfully install cryptography:

            Upgrade to the latest pip and try again. This will fix errors for most
            users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
            =============================DEBUG ASSISTANCE==========================

    ----------------------------------------

[stderr]
Warning: apt-key output should not be parsed (stdout is not a terminal)
Synchronizing state of docker.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable docker
WARNING: No swap limit support
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-17v_0468/cryptography/
"

More information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot 
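
Because the extension error packs the VM's [stdout] and [stderr] into a single message string, the actual cause is easy to miss. The following is a minimal sketch, not part of Batch Shipyard, of how such a message could be split into its sections and scanned for missing Python modules. The function names split_extension_output and missing_modules are hypothetical, and the sketch assumes the section markers appear on lines of their own, as they do in the output above.

```python
import re
from typing import Dict, List

# Assumption: section markers such as "[stdout]" sit alone on a line,
# as in the VM extension error message quoted in this issue.
_SECTION = re.compile(r"^\[(stdout|stderr)\][ \t]*$", re.MULTILINE)


def split_extension_output(message: str) -> Dict[str, str]:
    """Split an extension error message into its stdout/stderr sections."""
    sections: Dict[str, str] = {}
    matches = list(_SECTION.finditer(message))
    for i, match in enumerate(matches):
        # Each section runs until the next marker or the end of the message.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(message)
        sections[match.group(1)] = message[match.end():end].strip()
    return sections


def missing_modules(text: str) -> List[str]:
    """Return the module names reported by ModuleNotFoundError lines."""
    return re.findall(r"ModuleNotFoundError: No module named '([^']+)'", text)
```

Applied to the message above, the stdout section would report setuptools_rust as the missing module, while the stderr section holds only warnings and the final pip failure line.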

Redacted Configuration

monitor.yaml

monitoring:
  location: XXX
  resource_group: XXX
  hostname_prefix: XXX
  ssh:
    username: XXX
    ssh_public_key: XXX
    ssh_private_key: XXX
  public_ip:
    enabled: true
    static: false
  virtual_network:
    name: XXX
    resource_group: XXX
    existing_ok: false
    address_space: XXX
    subnet:
      name: XXX
      address_prefix: XXX
  network_security:
    ssh:
    - XXX
    grafana:
    - XXX
    - XXX
    prometheus:
    - XXX
  vm_size: XXX
  accelerated_networking: false
  services:
    resource_polling_interval: 15
    lets_encrypt:
      enabled: true
      use_staging_environment: true
    prometheus:
      port: XXX
      scrape_interval: 10s

Additional Comments

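As the "DEBUG ASSISTANCE" message suggests, the cryptography installation appears to fail because pip on the monitoring VM is out of date. The log shows pip downloading the source tarball (cryptography-3.4.6.tar.gz) and then failing in setup.py because the setuptools_rust module is missing. Could the fix be as simple as calling pip install --upgrade pip somewhere in the source code that defines the start task of the monitoring VM (or inside the relevant Docker image)?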
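
To make the suggestion concrete, here is a minimal sketch of the ordering I have in mind: upgrade pip first, then install docker-compose. It is only an illustration. The helper names build_install_commands and run_commands are hypothetical, and I have not checked how the monitoring VM's provisioning script actually performs the install (it may well be a shell script rather than Python).

```python
import subprocess
import sys
from typing import Callable, List, Sequence


def build_install_commands(python: str = sys.executable) -> List[List[str]]:
    """Return the pip invocations in the order they should run."""
    return [
        # Upgrade pip first, as the cryptography error message recommends.
        [python, "-m", "pip", "install", "--upgrade", "pip"],
        # Then install docker-compose, which pulls in paramiko and cryptography.
        [python, "-m", "pip", "install", "docker-compose"],
    ]


def run_commands(
    commands: Sequence[Sequence[str]],
    runner: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Run each command in order; the default runner raises on failure."""
    for argv in commands:
        runner(list(argv))
```

Calling run_commands(build_install_commands("python3")) on the VM would perform both steps and stop at the first one that fails. Passing a different runner allows the ordering to be checked without touching the network.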

Thanks in advance for the help!