User data scripts aren't executed when adding slaves

dhorgan commented 4 years ago

I think I've found a bug in the add-slaves command (or maybe I'm just misunderstanding how it's supposed to work). Here are my version specs:

Flintrock version: 1.0.0
Python version: 3.7.4
OS: macOS Catalina 10.15.4 (19E266)

I create my cluster with a configuration file that looks like this:

services:
  spark:
    version: 2.4.5
  hdfs:
    version: 2.7.7

provider: ec2

providers:
  ec2:
    ...  # Omitting region, availability zone, etc here for brevity
    instance-type: m5.large
    ami: ami-0ce49ca477b768354
    user-data: scripts/install-python3.sh

launch:
  num-slaves: 2
  install-hdfs: True
  install-spark: True

debug: false

The user data script at scripts/install-python3.sh installs Python 3.7 and looks like this:

#!/usr/bin/env bash

set -e

sudo yum install -y python3 python3-devel python3-pip python3-setuptools python3-virtualenv

As expected, when I run flintrock launch, the cluster is created and each node has Python 3 installed. I've verified this manually by SSH'ing into each node and checking.

Next, I add a single node by running flintrock add-slaves --num-slaves 1. When I log into the new node, Python 3.7 is not installed, so the user data script apparently did not run there. This is an unexpected result (in my opinion), but I'm not sure whether it's the intended behaviour, so I'm reporting it here.

Thanks for the otherwise excellent tool! :)

nchammas commented 4 years ago

Hello @dhorgan. Happy to hear that Flintrock is useful to you.

Thank you for submitting a well-written report. I can confirm that Flintrock is not attaching the correct user data to new slaves when it adds them to an existing cluster.

I suspect something is going wrong here, probably with the encoding of the data: https://github.com/nchammas/flintrock/blob/52c6c84c9a1845b0ce89ca138172a6ec4cf0d632/flintrock/ec2.py#L281
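To make the encoding suspicion concrete, here is a minimal sketch of one way it could fail. It is an illustration only, not a confirmed diagnosis of the linked line. EC2 carries user data as Base64, and cloud-init only runs it as a script if the decoded payload starts with a shebang (#!). If user data that is already Base64-encoded gets encoded a second time, the instance would decode it once, see Base64 text instead of a script, and skip it silently. The script contents and the helper names encode_user_data and looks_like_script are hypothetical and are not part of Flintrock or boto3.

```python
import base64

# Hypothetical stand-in for a user data script such as scripts/install-python3.sh.
SCRIPT = "#!/usr/bin/env bash\nset -e\necho hello\n"


def encode_user_data(text: str) -> str:
    """Base64-encode text, the form in which EC2 transports user data."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def looks_like_script(payload_b64: str) -> bool:
    """Decode once, as the instance would, and check for a leading shebang."""
    return base64.b64decode(payload_b64).startswith(b"#!")


once = encode_user_data(SCRIPT)   # correctly encoded exactly once
twice = encode_user_data(once)    # assumed failure mode: encoded a second time

print(looks_like_script(once))    # True: decodes to a shebang script
print(looks_like_script(twice))   # False: decodes to Base64 text, not a script
```

If this were the cause, decoding the value read back from the existing cluster before passing it on to new instances would be the likely fix, but that needs checking against the code linked above.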

nchammas commented 4 years ago

By the way, Flintrock's default (and my recommendation) is to use Hadoop 2.8.5 rather than the 2.7.7 in your config.

User data scripts aren't executed when adding slaves #303