juanrh opened this issue 8 years ago
See a precedent for Spark at http://forum.odroid.com/viewtopic.php?f=98&t=21369, and check
another interesting precedent at http://climbers.net/sbc/40-core-arm-cluster-nanopc-t3/.
According to https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/yarn_setup.html, HDFS is required for running Flink on YARN, because Flink uses HDFS to distribute the über jar, just like MapReduce. Possible options:
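For reference, starting a Flink 1.1 YARN session looks roughly like the sketch below; the jar upload to HDFS happens automatically when the session starts. The memory values (768 MB) and the `HADOOP_CONF_DIR` path are assumptions for these small boards, not something from the issue:

```shell
# Point Flink at the Hadoop/YARN configuration (path is an assumption)
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Start a YARN session with 2 TaskManagers; Flink uploads its jar to HDFS
# so the containers can fetch it. Memory sizes in MB are illustrative.
./bin/yarn-session.sh -n 2 -jm 768 -tm 768
```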
The first option sounds like the best first approach. The first cluster should ideally have 1 Cubox and 3 ODROIDs, to have one master and 2 slave nodes, but with 2 ODROIDs we might get approximately 4 containers of 700 MB each.
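As a back-of-the-envelope check of that container count (assuming, which the issue does not state, that each ODROID C2 has 2048 MB of RAM and that roughly 300 MB per node is reserved for the OS and the NodeManager daemon):

```python
# Rough YARN container estimate for the 2-ODROID setup.
# Assumptions (not from the issue): 2048 MB RAM per ODROID C2,
# ~300 MB reserved per node for the OS and NodeManager.
NODE_RAM_MB = 2048
RESERVED_MB = 300
CONTAINER_MB = 700
SLAVE_NODES = 2

containers_per_node = (NODE_RAM_MB - RESERVED_MB) // CONTAINER_MB
total_containers = containers_per_node * SLAVE_NODES
print(total_containers)  # 2 per node * 2 nodes = 4 containers of 700 MB
```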
For Spark, if we have a separate head node, then the driver would run either in the head node (yarn-client mode) or in a container on an ODROID slave (yarn-cluster mode). Either way the Cubox would not be executing computations, so a proof of concept with the Cubox as ResourceManager (compute master), NameNode (data master) and also the only DataNode (data slave) still makes sense. Future setups could include:
Set up a test cluster with 1 Cubox 4x4 as master and 2 ODROID C2s as slaves. Try running the latest Hortonworks HDP on Ubuntu with just YARN installed, and just Ambari for monitoring (no Ganglia or Nagios), i.e. with the following Ambari blueprint:
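The blueprint itself is not included above; as a sketch, a minimal YARN-plus-HDFS blueprint for this layout might look like the following. The blueprint name, stack version, and exact component list (including Ambari Metrics in place of Ganglia/Nagios) are assumptions, not taken from the issue:

```json
{
  "Blueprints": {
    "blueprint_name": "yarn-only-arm",
    "stack_name": "HDP",
    "stack_version": "2.4"
  },
  "host_groups": [
    {
      "name": "master",
      "cardinality": "1",
      "components": [
        {"name": "NAMENODE"},
        {"name": "DATANODE"},
        {"name": "RESOURCEMANAGER"},
        {"name": "ZOOKEEPER_SERVER"},
        {"name": "METRICS_COLLECTOR"},
        {"name": "METRICS_MONITOR"}
      ]
    },
    {
      "name": "slaves",
      "cardinality": "2",
      "components": [
        {"name": "NODEMANAGER"},
        {"name": "METRICS_MONITOR"}
      ]
    }
  ]
}
```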