GoogleCloudPlatform / deploymentmanager-samples

Deployment Manager samples and templates.
Apache License 2.0

lustre doesn't deploy if existing VPC is specified. #691

Open Tristan-Kosciuch opened 1 year ago

Tristan-Kosciuch commented 1 year ago

I don't believe lustre.jinja is configured to deploy Lustre into an existing VPC. My lustre.yaml has this in the cluster config:

    ### Use these fields to deploy Lustre in an existing VPC, Subnet, and/or Shared VPC
    vpc_net                 : slurm-gcp-v5-net
    #vpc_subnet              : slurm-gcp-v5-primary-subnet
    #shared_vpc_host_proj    : < Shared VPC Host Project name >

The network slurm-gcp-v5-net already exists, and vpc_subnet is commented out. The first error returned is "subnet_split is undefined". On line 23 of lustre.jinja, subnet_split is defined only when Lustre is being deployed into a new VPC. I fixed this error by adding the following to lustre.jinja:

{% if properties['vpc_net'] and not properties['vpc_subnet'] %}
{% set subnet = properties['cidr'].split('/')[0] %}
{% set subnet_split = subnet.split('.') %}
{% endif %}
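
To make the two set statements concrete, here is a worked example. The cidr value is a hypothetical one chosen for illustration, and how lustre.jinja uses the octets afterwards is not shown in this issue:

```jinja
{# Hypothetical input from lustre.yaml: cidr: 10.20.0.0/16 #}
{% set subnet = properties['cidr'].split('/')[0] %}  {# '10.20.0.0' (the prefix length is dropped) #}
{% set subnet_split = subnet.split('.') %}           {# ['10', '20', '0', '0'] #}
{# Octets can then be indexed individually, e.g. subnet_split[0] is '10' #}
```

Jinja's `if` blocks do not open a new variable scope, so a variable set inside the condition stays visible to the rest of the template.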

However, the rest of the template now references the wrong VPC. For instance, this is the block in lustre.jinja that creates a new subnet:

{% if not properties['vpc_subnet'] %}
# Create a subnet for the new cluster
- name: {{properties["cluster_name"]}}-lustre-subnet
  type: compute.v1.subnetwork
  properties:
    network: $(ref.{{properties["cluster_name"]}}-lustre-network.selfLink)
    ipCidrRange: {{ properties["cidr"]}}
    region: {{ region_ext }}
    privateIpGoogleAccess: TRUE

Notice that network: resolves to properties["cluster_name"]-lustre-network, not properties["vpc_net"]. I think the template needs to be reworked to handle an existing VPC passed without a subnet. I'll need to fix this so I can create a PR.

Update

I got the script working by adding this code before the resources block in lustre.jinja. Replace <Project ID> with your own GCP project ID:

{% if properties['vpc_net'] %}
  {% set vpc_reference = 'https://www.googleapis.com/compute/v1/projects/<Project ID>/global/networks/' ~ properties["vpc_net"]  %}
{% endif %}
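
A possible variation that avoids hardcoding the project ID is sketched below. It assumes the template can read Deployment Manager's `env["project"]` environment variable (the project of the deployment). The use of `shared_vpc_host_proj` here is a guess at how that field from lustre.yaml could be honored, and the `net_project` variable is hypothetical. None of this is the template's confirmed behavior:

```jinja
{% if properties['vpc_net'] %}
  {# Default to the project this deployment runs in #}
  {% set net_project = env['project'] %}
  {# Assumption: a Shared VPC network lives in the host project, if one is configured #}
  {% if properties['shared_vpc_host_proj'] %}
    {% set net_project = properties['shared_vpc_host_proj'] %}
  {% endif %}
  {% set vpc_reference = 'https://www.googleapis.com/compute/v1/projects/' ~ net_project ~ '/global/networks/' ~ properties['vpc_net'] %}
{% endif %}
```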

along with this code in the resources section:

# create subnet_split if using existing VPC
{% if properties['vpc_net'] and not properties['vpc_subnet'] %}
{% set subnet = properties['cidr'].split('/')[0] %}
{% set subnet_split = subnet.split('.') %}
{% endif %}

Finally, I changed the network references from network: $(ref.{{properties["cluster_name"]}}-lustre-network.selfLink) to network: {{ vpc_reference }}
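
One caveat with this change: vpc_reference is only set when vpc_net is given, so a deployment that creates a new VPC would render an empty network: value. A minimal sketch of a fallback follows, assuming the template still creates the {{properties["cluster_name"]}}-lustre-network resource in that case. The else branch is my addition and has not been tested against the real template:

```jinja
{% if properties['vpc_net'] %}
  {# Existing VPC: build the network URL by hand, as in the workaround above #}
  {% set vpc_reference = 'https://www.googleapis.com/compute/v1/projects/<Project ID>/global/networks/' ~ properties['vpc_net'] %}
{% else %}
  {# New VPC: keep the original reference to the network created by this deployment #}
  {% set vpc_reference = '$(ref.' ~ properties['cluster_name'] ~ '-lustre-network.selfLink)' %}
{% endif %}
```

With this in place, network: {{ vpc_reference }} should work for both the new-VPC and existing-VPC paths.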