Initial Observations
When deploying multiple Magpie units on the same baremetal machine, each communicating on different interfaces with different MTUs, Magpie appears to use the wrong reference interface to determine what MTU to expect. For instance, given machines having three interfaces with their respective configured MTUs as follows:
bondm MTU=1500
bond0.10 MTU=9000
bond1.7 MTU=9000
Magpie reports the following:
bondm: net MTU failed, mismatch: 1500 packet vs 9000 on iface bond0.10
bond0.10: net MTU failed, mismatch: 9000 packet vs 1500 on iface bondm
bond1.7: net MTU failed, mismatch: 9000 packet vs 1500 on iface bondm
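The bond0.10 and bond1.7 lines are what one would expect if each unit compared its packet MTU against bondm instead of its own interface. A simplified model of that comparison follows. It is not Magpie's actual code: the function name, the MTU table, and the "ok" message are hypothetical, and only the failure message format is taken from the report above.

```python
# MTUs configured on the example machine's interfaces (from the report).
IFACE_MTU = {"bondm": 1500, "bond0.10": 9000, "bond1.7": 9000}


def check_mtu(unit_iface, reference_iface, mtus=IFACE_MTU):
    """Compare the MTU a unit sends with the MTU of the reference interface.

    Hypothetical helper: it only models the comparison, it does not probe
    the network.
    """
    packet_mtu = mtus[unit_iface]
    reference_mtu = mtus[reference_iface]
    if packet_mtu == reference_mtu:
        # The wording of the success message is an assumption.
        return "{}: net MTU ok: {}".format(unit_iface, packet_mtu)
    return "{}: net MTU failed, mismatch: {} packet vs {} on iface {}".format(
        unit_iface, packet_mtu, reference_mtu, reference_iface)


# Using bondm as the reference reproduces the reported bond0.10 failure.
print(check_mtu("bond0.10", "bondm"))
# Using the unit's own interface as the reference passes.
print(check_mtu("bond0.10", "bond0.10"))
```

In this model the check only passes when the reference interface is the one the unit is actually bound to, which is why the choice of local IP matters.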
Steps to Reproduce
Given:
A MAAS environment with machines configured to communicate over 2 or more networks with different MTUs
A Juju controller using the MAAS provider
Save the following bundle as ~/magpie-bundle.yaml:
Then deploy as follows:
Code Analysis
Tracing the code that attempts to get the reference interface, we come across this snippet:
https://github.com/andrewdmcleod/magpie-layer/blob/4f6da1a197df866176fece7649aa6aa6a4ce4919/lib/charms/layer/magpie_tools.py#L107-L109
Specifically, local_ip is obtained by calling the underlying hook tool unit-get private-address. However, this hook tool is problematic, and the Juju team recommends using network-get instead (Reference bug). Trying out unit-get private-address manually yields:
$ juju run --unit magpie-ceph-access/1 unit-get private-address
10.3.126.93
...which is the address of the bondm interface. Trying out network-get manually yields:
...which is the information for the correct interface. Thus, line 109 in the above snippet could be updated to:
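As a hedged illustration of that kind of change (not the author's exact patch), the helper below resolves the local address by invoking the network-get hook tool for a named binding instead of relying on unit-get private-address. The function name, the injectable run parameter, and the "magpie" default are assumptions made for this sketch. The --primary-address flag is used here because it returns a single address; newer Juju releases may prefer other network-get options.

```python
import subprocess


def binding_address(binding="magpie", run=subprocess.check_output):
    """Return the primary address Juju reports for the given binding.

    Hypothetical helper: `binding` defaults to the "magpie" endpoint, and
    `run` is injectable so the function can be exercised outside a hook.
    """
    # network-get is only available inside a hook context
    # (for example, when invoked through `juju run`).
    output = run(["network-get", "--primary-address", binding])
    return output.decode("utf-8").strip()
```

Inside a hook, binding_address() would replace the value currently taken from unit_private_ip(), so the reference interface follows the binding and not the machine's private address.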
Workaround
One workaround is to set the default binding to the one that magpie is intended to test. For example:
This will cause Magpie to set the "magpie" binding to "ceph-access-space" as well, so explicitly setting "magpie" is not necessary. Using the above workaround will result in unit_private_ip() retrieving the correct IP, albeit by chance.