Since we define our var overrides via extra vars with a dictionary,
Ansible's default hash behaviour appears to replace the whole
dictionary rather than merge it key by key. This causes an issue when
you try to use only one of groups or fleets. Because of the logic we
use to work out whether the user is requesting an instance
group-backed or an instance fleet-backed cluster, we would either
launch with the default group-based setup (ignoring the fleet config)
or pass an erroneous string as the fleet config. It's broken either
way, since we can't run 99% of our jobs on an m1.medium. Technically,
the cluster could still launch successfully and schedule Hadoop jobs;
they would just inevitably fail with OOM errors at some point.
This change brings instance_fleets back into the fold by passing it
in, and changes the hash behaviour so that instance_groups and
instance_fleets are always both defined, as they should be, with the
correct defaults to trigger the detection logic.
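
To make the difference between the two hash behaviours concrete, here is a
minimal Python sketch. It is not Ansible's implementation and not our
actual role code: the `cluster` variable layout, the `cluster_kind` check
and the r5.xlarge override are assumptions made up for illustration. It
only shows why replacing a dictionary drops the key you did not override,
while merging keeps both keys defined.

```python
def replace_vars(defaults, overrides):
    """Mimic replace semantics: an overriding dict wins wholesale."""
    result = dict(defaults)
    result.update(overrides)
    return result


def merge_vars(defaults, overrides):
    """Mimic merge semantics: recurse into nested dicts key by key."""
    result = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_vars(result[key], value)
        else:
            # Non-dict values (including lists) are replaced outright.
            result[key] = value
    return result


def cluster_kind(cluster):
    """Hypothetical detection: a non-empty fleet config selects fleets."""
    return "fleets" if cluster.get("instance_fleets") else "groups"


# Hypothetical defaults: both keys are always present.
defaults = {
    "cluster": {
        "instance_groups": [{"type": "m1.medium", "count": 1}],
        "instance_fleets": [],
    }
}

# Hypothetical extra vars: the user only sets the fleet config.
overrides = {
    "cluster": {
        "instance_fleets": [{"type": "r5.xlarge", "capacity": 4}],
    }
}

replaced = replace_vars(defaults, overrides)["cluster"]
merged = merge_vars(defaults, overrides)["cluster"]

print(sorted(replaced))  # instance_groups is gone after a wholesale swap
print(sorted(merged))    # both keys survive a merge
print(cluster_kind(merged))
```

With merge semantics the detection check always sees both keys, so an
override of one side cannot silently remove the default of the other.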