High availability code for Mikrotik routers
RouterOS 6.44.6 is the only version that the author currently runs and tests with. Anything newer is untested and not recommended until it has been tested extensively. Do not upgrade RouterOS and expect it to work: Mikrotik has, at various points in time, broken features that ha-mikrotik relies on. If you are not comfortable testing new versions that may break your entire setup, wait for the author or another party to confirm compatibility.
This has been stable across 6 different pairs of CCR1009s for over a year. There have been multiple administrative failover events and a few hardware failovers. Please ensure you are using compatible hardware, RouterOS, and ha-mikrotik releases.
Please do not test this on production routers. This should be tested in a lab setup with complete out-of-band serial access. This was developed on the CCR1009-8g-1s-1s+ and is in use in our production environment. Proceed at your own risk: the code can potentially wipe out all of the configuration and files on your device.
Extensive documentation is still needed. This is being delivered as a proof of concept. You will need to do a bit of code reading and testing to figure out how it works.
The #1 issue is a race condition during the startup of the secondary after it gets an updated configuration. It needs to quickly disable all of the interfaces so that it doesn't end up taking traffic (MACs are cloned) from the active router. If you use spanning tree on your switches, it is likely that this will happen fast enough and the Layer2/3 won't have time to come up and cause issues. Test this very carefully, you will get very strange results if your ports start forwarding instantly from your upstream switch.
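The idea of shutting the data interfaces early in boot can be sketched in RouterOS script as follows. This is only an illustration of the concept, not ha-mikrotik's actual startup code: the scheduler entry name `ha-startup-sketch` is hypothetical, and `ether8` is assumed to be the dedicated HA interface. Try it only in a lab with serial console access, since it disables every other Ethernet port.

```routeros
# Hypothetical sketch, not part of ha-mikrotik.
# At boot, disable every Ethernet interface except the assumed HA link (ether8),
# so a freshly synchronized standby with cloned MACs does not attract traffic.
/system scheduler add name=ha-startup-sketch start-time=startup \
    on-event="/interface ethernet disable [find where name!=\"ether8\"]"
```

A scheduler entry like this still runs only after RouterOS has finished loading its configuration, so it narrows the race window rather than eliminating it. That is why the forwarding delay on the upstream switch ports still matters.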
Using a dedicated interface, VRRP, scripts, and backups, we can make a pair of Mikrotik routers highly available. Configuration and files are actively synchronized to the standby, and the standby remains ready to take over when the VRRP heartbeat fails.
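For readers unfamiliar with VRRP on RouterOS, a heartbeat on a dedicated link might look roughly like the sketch below. It is not taken from ha-mikrotik: the interface name `ha-vrrp-sketch`, the `vrid` and `priority` values, and the log messages are all assumptions chosen for illustration. `$HAInstall` sets up its own configuration, so do not add this by hand to a router that already runs ha-mikrotik.

```routeros
# Hypothetical sketch, not ha-mikrotik's real configuration.
# A VRRP instance on the dedicated link (ether8 assumed) acts as the heartbeat.
# The on-master / on-backup hooks run a script when the role changes.
/interface vrrp add name=ha-vrrp-sketch interface=ether8 vrid=1 priority=100 \
    on-master=":log info \"HA sketch: this router became master\"" \
    on-backup=":log info \"HA sketch: this router became backup\""
```

In a real setup the role-change hooks would be where interfaces are enabled or disabled and where synchronization is triggered. Here they only write a log entry.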
Pair of CCR1009-8g-1s-1s+. RouterOS v6.33.5. Routerboard firmware 3.27. Bootstrapped from completely erased routers, with the config built up once HA was installed.
MIkrotik - REAL HA Configuration
/file remove [find]; /system reset-configuration keep-users=no no-defaults=yes skip-backup=yes
/import HA_init.rsc
$HAInstall interface="ether8" macA="[MAC_OF_A_ETHER8]" macB="[MAC_OF_B_ETHER_8]" password="[A RANDOM PASSWORD OF YOUR CHOOSING]"
$HASyncStandby
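The `macA` and `macB` values passed to `$HAInstall` above are the MAC addresses of the HA interface on each router. One way to read them, assuming the default HA interface `ether8`, is to run the following on each router before installing:

```routeros
# Print the MAC address of ether8 (assumed to be the HA interface).
# Run once on router A and once on router B, and note each value.
:put [/interface ethernet get ether8 mac-address]
```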
/import HA_init.rsc
Run $HAPushStandby on the active; this should push the new code and reboot the standby.
Run $HASyncStandby on the active; there should be no changes (unless something else changed on the active in between).
Run $HASwitchRole on the active.
Rebuilding failed hardware is similar to a new installation, except that we don't need to reset both routers and don't need to bring in a new HA_init, assuming both RouterOS versions are compatible.
Install a compatible version of RouterOS on the new hardware and factory reset the configuration. Connect the new hardware exactly the same way the old one was connected. We assume you have used the default of ether8 for the $haInterface.
If A is active, run from A:
$HAInstall interface=$haInterface macA=$haMacMe macB="[NEW_MAC_OF_B_ETHER8]" password=$haPassword
$HASyncStandby
If B is active, run from B:
$HAInstall interface=$haInterface macB=$haMacMe macA="[NEW_MAC_OF_A_ETHER8]" password=$haPassword
$HASyncStandby
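The rebuild commands above rely on the global variables `$haInterface`, `$haMacMe`, and `$haPassword` already being defined on the surviving router. A quick way to check this before running `$HAInstall`, assuming the variables are visible from your terminal session, is:

```routeros
# List all global variables currently defined on this router.
/system script environment print
# Print the two non-secret values that the rebuild command will reuse.
:put $haInterface
:put $haMacMe
```

If either value prints empty, the HA code is probably not loaded in this session, and running the rebuild command would pass blank arguments.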