nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Support launching clusters running Ubuntu #95

Open ypopkov opened 8 years ago

ypopkov commented 8 years ago

Is this a known issue? (I am spinning up an Ubuntu 14.04.4 LTS AMI from macOS.) Thanks

$ flintrock launch test-cluster
Requesting 2 spot instances at a max price of $1.0...
0 of 2 instances granted. Waiting...
0 of 2 instances granted. Waiting...
All 2 instances granted.
[54.211.42.46] SSH online.
[54.161.133.64] SSH online.
[54.211.42.46] Configuring ephemeral storage...
[54.161.133.64] Configuring ephemeral storage...
At least one node raised an error: lsblk: unrecognized option '--paths'

Usage:
 lsblk [options] [<device> ...]

Options:
 -a, --all            print all devices
 -b, --bytes          print SIZE in bytes rather than in human readable format
 -d, --nodeps         don't print slaves or holders
 -D, --discard        print discard capabilities
 -e, --exclude <list> exclude devices by major number (default: RAM disks)
 -f, --fs             output info about filesystems
 -h, --help           usage information (this)
 -i, --ascii          use ascii characters only
 -m, --perms          output info about permissions
 -l, --list           use list format ouput
 -n, --noheadings     don't print headings
 -o, --output <list>  output columns
 -P, --pairs          use key="value" output format
 -r, --raw            use raw output format
 -t, --topology       output info about topology

Available columns:
       NAME  device name
      KNAME  internal kernel device name
    MAJ:MIN  major:minor device number
     FSTYPE  filesystem type
 MOUNTPOINT  where the device is mounted
      LABEL  filesystem LABEL
       UUID  filesystem UUID
         RO  read-only device
         RM  removable device
      MODEL  device identifier
       SIZE  size of the device
      STATE  state of the device
      OWNER  user name
      GROUP  group name
       MODE  device node permissions
  ALIGNMENT  alignment offset
     MIN-IO  minimum I/O size
     OPT-IO  optimal I/O size
    PHY-SEC  physical sector size
    LOG-SEC  logical sector size
       ROTA  rotational device
      SCHED  I/O scheduler name
    RQ-SIZE  request queue size
       TYPE  device type
   DISC-ALN  discard alignment offset
  DISC-GRAN  discard granularity
   DISC-MAX  discard max bytes
  DISC-ZERO  discard zeroes data

For more information see lsblk(8).
Traceback (most recent call last):
  File "/tmp/setup-ephemeral-storage.py", line 158, in <module>
    non_root_block_devices = get_non_root_block_devices()
  File "/tmp/setup-ephemeral-storage.py", line 63, in get_non_root_block_devices
    '--noheadings'])
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['lsblk', '--ascii', '--paths', '--output', 'KNAME,MOUNTPOINT', '--inverse', '--nodeps', '--noheadings']' returned non-zero exit status 1
Do you want to terminate the 2 instances created by this operation? [Y/n]: 

nchammas commented 8 years ago

What you're seeing here is most likely the result of lsblk having different options on Ubuntu vs. Amazon Linux.
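To illustrate the mismatch, here is a minimal sketch of how a setup script could adapt its lsblk call to the options the installed version supports. It is not Flintrock's actual code; the function names are hypothetical, and checking the output of `lsblk --help` is an assumption about how to detect support. Note that the usage text in the error output above lists neither --paths nor --inverse, so the sketch treats both as optional.

```python
import subprocess


def lsblk_supports(option):
    """Return True if `lsblk --help` mentions the given long option.

    Assumption: the help text lists every supported long option.
    """
    try:
        proc = subprocess.Popen(
            ['lsblk', '--help'],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True)
    except OSError:
        # lsblk is not installed at all.
        return False
    help_text, _ = proc.communicate()
    return option in help_text


def build_lsblk_command(supports_paths, supports_inverse):
    """Build the lsblk call from the traceback, minus unsupported options."""
    cmd = ['lsblk', '--ascii']
    if supports_paths:
        cmd.append('--paths')
    cmd += ['--output', 'KNAME,MOUNTPOINT']
    if supports_inverse:
        cmd.append('--inverse')
    cmd += ['--nodeps', '--noheadings']
    return cmd


def parse_devices(output, has_full_paths):
    """Turn lsblk output into (device_path, mountpoint_or_None) tuples."""
    devices = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if not has_full_paths:
            # Without --paths, KNAME is a bare name such as "xvdb".
            name = '/dev/' + name
        mountpoint = fields[1] if len(fields) > 1 else None
        devices.append((name, mountpoint))
    return devices
```

A caller would pass `lsblk_supports('--paths')` and `lsblk_supports('--inverse')` into `build_lsblk_command`, run the result, and hand the output to `parse_devices`.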

Flintrock is currently developed and tested against Amazon Linux and, to a lesser extent, CentOS.

With some work we might be able to get Flintrock to work against Ubuntu and other distributions, but that's not on the roadmap at the moment. I am open to proposals on how to support a popular distribution like Ubuntu as long as they don't meaningfully increase Flintrock's complexity or maintenance burden.

Since Fedora and Red Hat are much more similar to Amazon Linux and CentOS, you should have an easier time getting Flintrock to work with those distributions.

ypopkov commented 8 years ago

thanks. btw, it looks like "--paths" is an option on newer versions of ubuntu: http://manpages.ubuntu.com/manpages/trusty/man8/lsblk.8.html

nchammas commented 8 years ago

Good catch. (I assume you meant to link to Wily or Xenial, not Trusty.)

Maybe things will work with Ubuntu 15? Though I wouldn't be surprised if you then hit something else...

For example, in some cases Flintrock uses yum to install stuff (ex1, ex2), but you may not hit those if you install a release version of Spark and Java is already installed on the image.
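As a rough sketch of what removing the yum dependency could look like, the snippet below picks whichever package manager is present and builds the matching install command. The function names are hypothetical, only yum and apt-get are considered, and package names usually differ between distributions, so the caller would still have to supply the right ones.

```python
import shutil

# Base install commands for the package managers this sketch knows about.
INSTALL_COMMANDS = {
    'yum': ['yum', 'install', '-y'],
    'apt-get': ['apt-get', 'install', '-y'],
}


def detect_package_manager(which=shutil.which):
    """Return the first known package manager found on PATH, or None.

    `which` is injectable so the lookup can be exercised without touching
    the real system.
    """
    for manager in ('yum', 'apt-get'):
        if which(manager):
            return manager
    return None


def install_command(manager, packages):
    """Build the argument list that installs `packages` with `manager`."""
    if manager not in INSTALL_COMMANDS:
        raise ValueError('unsupported package manager: {}'.format(manager))
    return ['sudo'] + INSTALL_COMMANDS[manager] + list(packages)
```

For example, `install_command(detect_package_manager(), ['some-package'])` would produce a yum command on Amazon Linux or CentOS and an apt-get command on Ubuntu, where `some-package` is a placeholder.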

ypopkov commented 8 years ago

my devops tell me I should stay with LTS - and who am I to argue with devops? ;)

ypopkov commented 8 years ago

to improve portability, one could move system-dependent steps like ephemeral storage setup into the instance user_data/bootstrap scripts and/or Flintrock plugins (similar to how StarCluster does it), so that the end user is responsible for the low-level bootstrap, either by customizing the user data script or by implementing a plugin
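One possible shape for such a plugin mechanism, purely as a sketch: a registry that maps a distribution name to a setup hook, with a built-in default. Nothing here exists in Flintrock or StarCluster; `SETUP_HOOKS`, `setup_hook`, `run_setup`, and the distribution keys are all hypothetical, and the hooks only return descriptions of what they would do.

```python
# Hypothetical registry: distribution name -> function that sets up a host.
SETUP_HOOKS = {}


def setup_hook(distro):
    """Decorator that registers a setup function for one distribution."""
    def register(func):
        SETUP_HOOKS[distro] = func
        return func
    return register


@setup_hook('amazon-linux')
def setup_amazon_linux(host):
    # Stand-in for the built-in behavior; a real hook would run commands.
    return 'default ephemeral storage setup on ' + host


def run_setup(distro, host, default='amazon-linux'):
    """Run the hook registered for `distro`, or the default if none is."""
    hook = SETUP_HOOKS.get(distro, SETUP_HOOKS[default])
    return hook(host)


# A user-supplied plugin would register its own hook the same way:
@setup_hook('ubuntu')
def setup_ubuntu(host):
    return 'user-provided ephemeral storage setup on ' + host
```

With this layout the default path needs no configuration, and only users on other distributions have to write a hook.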

nchammas commented 8 years ago

I've never built a plugin system, and I'm not too familiar with StarCluster (I tinkered with it once a while back), so at this time I can't comment authoritatively on whether this is a good idea or not.

I can say though that I'd always want Flintrock to work extremely well out-of-the-box, and expose any additional complexity (e.g. via a plugin system) only to those users who ask for it.

As a sort of case study, figuring out what it takes to support Ubuntu may give us a better idea of how to approach this problem.

ypopkov commented 8 years ago

Happy to report that I was able to spin up a two-node cluster using an Ubuntu 16.04.beta2LTS AMI (with Spark pre-installed). The instance setup went fine (see below), but start-master.sh and start-slave.sh did not succeed (probably because of the "spark/sbin" path). I was able to start them manually.

One approach for fixing this issue is to add Spark's sbin directory to PATH so one can call those scripts directly by name.
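A small sketch of that idea, assuming the Spark install directory is known: the hypothetical helper below returns a PATH value with the sbin directory prepended, without adding it twice. The Spark location in the usage note is a placeholder, not a path taken from Flintrock.

```python
import os


def path_with_sbin(spark_home, current_path):
    """Return `current_path` with <spark_home>/sbin prepended once."""
    sbin = os.path.join(spark_home, 'sbin')
    entries = [entry for entry in current_path.split(os.pathsep) if entry]
    if sbin in entries:
        # Already present, so leave PATH untouched.
        return current_path
    return os.pathsep.join([sbin] + entries)
```

A caller could then set `os.environ['PATH'] = path_with_sbin('/path/to/spark', os.environ.get('PATH', ''))` before invoking start-master.sh by name.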

$ flintrock launch test-cluster
Requesting 2 spot instances at a max price of $1.0...
0 of 2 instances granted. Waiting...
All 2 instances granted.
[54.160.196.222] SSH online.
[54.82.43.227] SSH online.
[54.82.43.227] Configuring ephemeral storage...
[54.160.196.222] Configuring ephemeral storage...
launch finished in 0:02:33.

nchammas commented 8 years ago

Hmm, why do you think there is a problem with the sbin path? That part should work regardless of the Linux distribution, so I would assume it's something else interfering with the master/slave startup.