Scripts for bootstrapping a local MarkLogic cluster for development purposes using Vagrant and VirtualBox.
By default these scripts create 3 'grtjn/centos-7.2' Vagrant VMs, running in VirtualBox. The names and ips will be recorded in /etc/hosts of host and VMs with use of vagrant-hostmanager. MarkLogic (including dependencies) will be installed on all three vms, and bootstrapped to form a cluster. The OS will be fully updated initially, and "Development Tools" installed as well. Zip/Unzip, Java, MLCP, Nodejs, Bower, Gulp, Forever, Ruby, Git, and Tomcat will be installed, and configured. A bare git repository will be prepared in /space/projects. All automatically with just a few commands.
Each VM takes roughly 2.5Gb. The VM template, together with 3 VMs will take about 10Gb of disk space. In addition, each VM that is launched will claim 2Gb of RAM, and 2 CPU cores. Make sure you have sufficient resources!
Special credits to @peetkes and @miguelrgonzalez for giving me a head start with this. Thanks to anyone else that has provided help or feedback!
Note: this project used to depend on chef/centos boxes, but they are no longer available. They have been 'moved' to bento, which only published latest release of each major version. I have recovered the chef base boxes from my local Vagrant cache, and republished on Atlas with my personal account: https://atlas.hashicorp.com/grtjn. I'll be using base boxes published there from now on.
You first need to download and install prerequisites and mlvagrant itself:
vagrant plugin install vagrant-hostmanager
vagrant plugin install vagrant-proxyconf
vagrant plugin install vagrant-vbguest
(Make sure your Vagrantfile has config.vbguest.no_install = true
!)/space/software
(For Windows: c:\space\software
):
sudo mkdir -p /space/software
sudo chmod 777 /space/software
/space/software
(no need to unzip MLCP!)git clone https://github.com/grtjn/mlvagrant.git
/opt/mlvagrant
(For Windows: c:\opt\mlvagrant
), if it doesn't exist yet:
sudo mkdir -p /opt/mlvagrant
sudo chmod 755 /opt/mlvagrant
mlvagrant/opt/mlvagrant/*
to /opt/mlvagrant/
IMPORTANT:
You will also need to get hold of a valid license key. Put the license key info in the appropriate ml license properties file in /opt/mlvagrant. You will need an Enterprise (Developer) license for setting up clusters. For project-specific licenses, copy these files next to project.properties first, and edit them there.
Above steps need to taken only once. For every project you wish to create VMs, you simply take these steps:
mlvagrant/project/Vagrantfile
to that foldermlvagrant/project/project.properties
to that foldervagrant up --no-provision
(may take a while depending on bandwidth, particularly first time)vagrant provision
(may take a while, enter sudo password when asked, to allow changing /etc/hosts)That is all that is necessary to create a fully-prepared 3-node MarkLogic cluster running on CentOS 7.2 VMs. It takes the name of the project folder as prefix for the host names, to make running projects in parallel easier. If you ran the above in a folder called 'vgtest', it will have created three nodes with the names:
To gain ssh access to the first, you do:
vagrant ssh vgtest-ml1
To take one host down you do:
vagrant halt vgtest-ml2
To take down all you do:
vagrant halt
To destroy all VMs (maybe to recreate them from scratch):
vagrant destroy
In the case you'd like to do an upgrade of MarkLogic accross your cluster, you can use:
vagrant reload
It will look for new MarkLogic and MLCP installers, and ask if you'd like to use different ones. After that it will stop all VMs, reload the Vagrantfile, and restart all VMs as usual with vagrant reload. In addition it will, stop MarkLogic services, remove rpm installations, and run the selected installer for MarkLogic. It will also install the new MLCP installer if changed.
To clean-install MarkLogic accross your cluster, and optionally upgrade/downgrade at the same time, you can use:
MLV_REMOVE_ML=1 vagrant reload
That will do the same as above, but additionally flush the MarkLogic data directories on all VMs, and effectively install from scratch. It will also rejoin all hosts in a new cluster.
The project.properties
file contains various settings, amongst others:
nr_hosts
, defaults to 3ml_version
, defaults to '10'The minimum number of hosts is 1, the maximum is limited mostly by the local resources you have available. Each vm will take 2.5Gb of disk space, and by default (also in the Vagrantfile) takes 2Gb of ram, and 2 CPU cores.
Note: although you can technically create a cluster of just 2 nodes, 3 nodes is required for proper fail-over. The cluster needs a quorum to vote if a host should be excluded.
The ml_version is used in the install-ml-centos.sh
script to select the appropriate installer. Code is in place to install versions 5, 6, 7, 8, 9, and 10. The install-ml script refers to latest rpms by exact name, which includes subversion number, and patch level. Use the ml_installer property to override with the exact version you prefer to install.
For the full list of settings see below..
Project name - defaults to current directory name
VM naming pattern - defaults to {project_name}-ml{i}, also allowed: {ml_version}
IMPORTANT: DON'T CHANGE ONCE YOU HAVE CREATED THE VM'S!!
CentOS base VM version - defaults to 7.2, allowed: 5.11/6.5/6.6/6.7/6.8/6.9/7.0/7.1/7.2
Note: MarkLogic 8+ does not support CentOS 5- Note: MarkLogic 9+ does not support CentOS 6- Note: MarkLogic 10+ seems to need latest CentOS 7, which is why OS is updated by default since mlvagrant 1.0.6
Major MarkLogic release to install - defaults to 10, allowed: 5,6,7,8,9,10 (installers need to be present)
Number of hosts in the cluster - defaults to 3, minimum for failover support
Memory assigned to master node in cluster (first vm) - defaults to 2048
Note: MarkLogic 9 EA3 requires at least 4Gb of memory.
Number of cpus assigned to master node in cluster (first vm) - defaults to 2
Memory assigned to each slave node in cluster - defaults to same as master_memory
Note: MarkLogic 9 EA3 requires at least 4Gb of memory.
Number of cpus assigned to each slave node in cluster - defaults to same as master_cpus
Name of public_network to use in Vagrant, for instance "en0: Wi-Fi (AirPort)" - defaults to ""
Note: enabling this makes your VMs accessible from outside, beware of security leaks
Assign dedicated private IP to master node - slaves get same ip + i
URL for a network proxy for FTP, HTTP, and HTTPS requests
Use of this setting requires installation of the vagrant-proxyconf
plugin
Hostnames or IP addresses that do not require use of the network proxy
Mount an extra folder from host on vm - project dir is automatically shared as /vagrant
Override hard-coded MarkLogic installers (file is searched in /space/software, or c:\space\software\ on Windows)
Override hard-coded MLCP installers (file is searched in /space/software, or c:\space\software\ on Windows)
Run full OS updates - defaults to true
Note: doing this with CentOS 6.5 or 7.0 will take it up to the very latest minor release (6.9+ resp 7.6+)
Install group "Development tools" - defaults to true
Install zip/unzip - defaults to true
Note: Zip/unzip not required for MLCP (provided through Java)
Install Java - defaults to true
Note: necessary for MLCP Note: installs JDK 8 currently
Install MarkLogic Content Pump - defaults to true
Note: installs an MLCP version that matches ml_version, unless an explicit mlcp_installer was specified Note: this will force installation of JDK 8, and unzip (unzip required for installation)
Install Node.js, npm - defaults to true
Install Long-Term Support version of Node.js (v10 currently) - defaults to true
Install Ruby - default to true
Note: Ruby is mostly already installed on CentOS, this is just to be certain. Note: CentOS 7 comes with Ruby v2 out of the box.
Install Git command-line tools - defaults to true
Initializes a bare Git repository under /space/projects, along with a user named {project_name} to use it Also installs legacy tooling for it, including bower, gulp, forever (globally)
Install PM2 NodeJs Process Manager, for running NodeJs services - defaults to true
Note: this will force installation of Git v2. Note: pm2-logrotate was added by default since 1.0.6
Install and enable HTTPD service - defaults to true
Note: HTTPD is mostly already installed on CentOS, this is just to be certain
Install modules and tools for SSL/HTTPS - defaults to true
Note: HTTPD needs to be configured properly to enable HTTPS in there. See README for details.
Install Tomcat - defaults to true Enable the service - defaults to false
Note: Tomcat could be pre-installed, but usually isn't enabled by default. This will make sure it is installed. It is not launched by default for security reasons. Note: on CentOS 5 you get Tomcat 5 (tomcat5), on CentOS 6 you get Tomcat 6 (tomcat6), on CentOS 7 you get Tomcat 7 (tomcat)
The earlier version of mlvagrant was using public_network, and that will likely reappear as option soon. Handing out of IPs in that case depends on the external DHCP of the network you happen to be connected with. If you are running on a laptop, and take it elsewhere, your laptop, and public_network VMs will get new IPs. At that moment the hosts tables become outdated. You can fix that with a simple command though:
vagrant hostmanager
That will go over all VMs, get its current IPs, and update the hosts tables on host and all VMs.
Scaling up or down is not too big an issue, just make sure you follow below steps accurately:
To scale up:
vi project.properties
, increase nr_hosts
vagrant status
, make note of names of not-created VMsvagrant up --no-provision {names of 'not-created' VMs}
(no need to provision the other ones again)vagrant hostmanager
(will run across all VMs and host to update hosts tables with new VMs)vagrant provision {names of 'not-created' VMs}
To scale down:
vagrant status
, make note of last VM namehttp://lastvmname:8001/
)vagrant destroy {lastvmname}
vi project.properties
, decrease nr_hosts
settingNote: you can scale down multiple hosts, but make sure to remove VMs from last to first. VM names are calculated by incrementing from 1 to nr_hosts. So, better not to leave gaps.
If you don't provide a valid license upfront, the slave nodes likely won't be able connect to the master. Installation of MarkLogic 5 likely fails alltogether. You will need to open Admin UI on the master node (the first VM), apply a valid license, and then restart MarkLogic on all VMs. The lazy way to do the latter is to simply halt all VMs, and bring them up again (vagrant halt ; vagrant up
)
Note: the correct procedure is to install valid licenses on all slave nodes as well. Open Admin UI via those hosts, and apply a license there as well.
A local git repository with a post-receive hook is initialized for you, together with a user-account for it. All you need to do to push any git repository onto the server is (assuming project name 'vgtest'):
git remote add vm vgtest@vgtest-ml1:/space/projects/vgtest.git
git push vm
The name of the user is derived from the folder name. The password is initialized to equal the user name, but can be changed if desired through:
vagrant ssh vgtest-ml1
sudo passwd vgtest
The bootstrap scripts contain a few safeguards that should allow running it outside (ML)Vagrant as well. I have used them on a fair number of internal demo-servers with success, also to create fully operational clusters in just a few steps. The procedure is a little different, but will save you a lot of manual typing:
sudo mkdir -p /space/software
sudo mkdir -p /opt/mlvagrant
sudo chown $USER:sshuser /space/software
sudo chown $USER:sshuser /opt/mlvagrant
scp Downloads/MarkLogic-8.0-5.x86_64.rpm <node1 name/ip>:/space/software/
scp Downloads/mlcp-8.0-5-bin.zip <node1 name/ip>:/space/software/
scp <mlvagrant project dir>/opt/mlvagrant/* <node1 name/ip>:/opt/mlvagrant/
bootstrap-server.sh
script that you could take as example.chmod +x /opt/mlvagrant/*.sh
#! /bin/sh
echo "running $0 $@"
./bootstrap-centos-master.sh -v 8 <node1 name/ip> <projectname>
#! /bin/sh
echo "running $0 $@"
./bootstrap-centos-extra.sh -v 8 <node1 name/ip> <nodeN name/ip> <projectname>
scp /space/software/MarkLogic-8.0-6.x86_64.rpm <nodeN name/ip>:/space/software/
scp /space/software/mlcp-8.0.6-bin.zip <nodeN name/ip>:/space/software/
scp /opt/mlvagrant/* <nodeN name/ip>:/opt/mlvagrant/
Next, initiate MarkLogic bootstrapping on every machine, one by one. This will also by default install MLCP, Java, Git, NodeJS, and other useful tools, and make the MarkLogic instances join together in a cluster:
cd /opt/mlvagrant/
./bootstrap-node1.sh
http://<node1 name/ip>:8001/
(MarkLogic Admin UI)Congratulations, you should have a working cluster. Now you can start deploying your MarkLogic applications on it!
To be able to use HTTPS in HTTPD, you need mod_ssl and openssl libraries. Both are installed by mlvagrant by default.
Next to that, you will need a properly signed certificate. Your local IT department will likely be able to help with that. Here a general description of the practical steps required: https://wiki.centos.org/HowTos/Https#head-37cd1f5c67d362756f09313cd758bef48407c325 (section 2).
Once there, you can start configuring HTTPD for using HTTPS. You can do that per VirtualHost, as described here: https://wiki.centos.org/HowTos/Https#head-35299da4f7078eeba5f5f62b0222acc8c5f2db5f (section 3), but you can also consider configuring the SSLCert* properties on global level in /etc/httpd/conf.d/ssl.conf
.
While in there, also consider applying a more strict SSL policy:
SSLCipherSuite ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!TLSv1.0:!SSLv3
SSLProtocol ALL -SSLv2 -SSLv3
Once you have applied those SSL properties globally, you only need to add SSLEngine on
to the appropriate VirtualHosts in /etc/httpd/conf/httpd.conf
.
Restart HTTPD to apply config changes.
MlVagrant now also includes setting up PM2. It is recommended to create an appropriate pm2 user upfront for running the pm2 service 'globally'. The following installation guide give an impression of how PM2 can be used to both deploy and run project code on a server: https://github.com/marklogic/slush-marklogic-node/blob/master/app/templates/INSTALL.mdown#deploying-to-a-server
It has once been reported that the download of a basebox got interrupted, and subsequent attempts were failing. You can go round this manually by deleting the temporary download file that is mentioned in the error messages. You might also be able to work around this with the vagrant box
. vagrant box list
will show which boxes are present, you can then try to remove
the one that causes trouble, or perhaps use update
in an attempt to get it fixed.
If you install the vagrant-vbguest plugin, you will get notifications like these below if it notices a mismatch between your local installation of VirtualBox, and the Guest Additions on the vm:
==> ml9-ml1: Checking for guest additions in VM...
ml9-ml1: The guest additions on this VM do not match the installed version of
ml9-ml1: VirtualBox! In most cases this is fine, but in rare cases it can
ml9-ml1: prevent things such as shared folders from working properly. If you see
ml9-ml1: shared folder errors, please make sure the guest additions within the
ml9-ml1: virtual machine match the version of VirtualBox you have installed on
ml9-ml1: your host and reload your VM.
ml9-ml1:
ml9-ml1: Guest Additions Version: 5.1.6
ml9-ml1: VirtualBox Version: 4.3
Usually as long as the Guest Additions version is higher than that of your VirtualBox, you should be good. If it is behind just a little, like GA version 5.1.6 versus VBox Version 5.1.8, you are probably good too. We noticed issues though with older baseboxes that have GA version 5.0.2 in combination with VBox version 5.1.x. You can use the vagrant-vbguest plugin to install the correct Guest Additions if necessary. You can find documentation for its command-line usage here:
https://github.com/dotless-de/vagrant-vbguest#running-as-a-command
In the event Guest Additions and VirtualBox are too much out of synch, you could get a message like this:
[vagrant-ml1] No Virtualbox Guest Additions installation found.
Updating GuestAdditions skipped.
This seems to be the case when using the centos-7.2 basebox (one of the most recent currently) against VirtualBox 5.2, and typically occurs after the first halt/up. In that case you are forced to reinstall them. You can do so easily using:
vagrant vbguest --do install --no-cleanup
Once completed, you should be able to halt/up your vm(s) as pleased without any further issues.
Pretty simple:
vagrant halt
vi project.properties
, change master_memory
, master_cpus
according to needs, and also slave_memory
, and slave_cpus
if you have more than one node in your local clustervagrant up
Done!
Backup the virtual box vm to be sure, or at least any data on it that needs to be preserved no matter what.
vagrant halt
Settings
Storage
sectionHard Disk
(to the same controller)VMDK
type (Virtual Machine DisK)Dynamically Allocated
(to make the host file strech according to usage)box-disk2
as name, and select 40Gb as size (or something else according to available space and needs)Settings
dialogvagrant up
vagrant ssh
sudo service MarkLogic stop
sudo fdisk -l
will reveal a new /dev/sdb, but the space is still unallocatedsudo fdisk /dev/sdb
to start allocating the spacen
to add a new partition to that diskp
for primary partition1
to add the first primary partition<enter>
twice to include all available sectors into the partitiont
to change the partitions system id8e
to pick Linux LVMw
to write the changes and exit fdisksudo fdisk -l
will reveal both /dev/sdb, and the /dev/sdb1 partition nowsudo df -T
, it should be xfs
(or maybe ext4
for older or centos6 baseboxes)sudo mkfs.xfs /dev/sdb1
(or sudo mkfs.ext4 /dev/sdb1
for the ext4 type)sudo pvcreate /dev/sdb1
(confirm to ignore the warning)sudo vgs
, it should report centos
(or maybe VolGroup
in older baseboxes)sudo vgextend centos /dev/sdb1
(or sudo vgextend VolGroup /dev/sdb1
)sudo lvs
, it should report root
and swap
, both connected to the centos
volume group (lv_root
and lv_swap
in older baseboxes)root
volume with: sudo lvextend -l +100%FREE /dev/centos/root
(or sudo lvextend -l +100%FREE /dev/VolGroup/lv_root
)sudo xfs_growfs /dev/centos/root
(or for ext4: sudo resize2fs /dev/VolGroup/lv_root
)sudo vgs
, sudo lvs
, and sudo df -h
sudo service MarkLogic start
exit
Done!