TritonDataCenter / manta-muskie

Manta WebAPI
Mozilla Public License 2.0

manta-muskie: The Manta WebAPI

This repository is part of the Joyent Manta project. For contribution guidelines, issues, and general documentation, visit the main Manta project page.

manta-muskie holds the source code for the Manta WebAPI, otherwise known as "the front door". API documentation is in docs/. Some design documentation (possibly quite dated) is in docs/internal. Developer notes are in this README.

Active Branches

There are currently two active branches of this repository, for the two active major versions of Manta. See the mantav2 overview document for details on major Manta versions.


Testing

Muskie tests use node-tap. Test files are all named "test/**/*.test.js". Tests are divided into:

  1. Unit tests (test/unit/*.test.js). These can be run from either a git clone (make test-unit is hooked up to do this) or inside a deployed muskie (aka "webapi") instance.

  2. Integration tests (test/integration/*.test.js). These must be run from a deployed muskie instance.

Each test file must be written to run independently. This allows running tests in parallel and being able to understand what a test is doing without assumed external setup steps.

To run unit tests in a git clone:

make test-unit

To run all tests on a (non-production) muskie (aka "webapi") instance:

ssh DC-HEADNODE-GZ      # login to the headnode
manta-login webapi      # login to a webapi instance

By default, "runtests" shows a compact results summary. Full TAP output is written to "/opt/smartdc/muskie/test.tap". See the comment in runtests for various use cases for running the tests, e.g. running individual test files or forcing TAP output.

To run the muskie test suite in the internal Joyent "nightly-2" DC, run:
    TARGET_RIG: nightly-2
    CMD:        stage-test-manta-muskie
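
Because the full TAP output is kept in "/opt/smartdc/muskie/test.tap", a quick pass/fail tally can be pulled from that file after a run. The following is a minimal sketch, not part of the muskie repository: the function name tap_summary is hypothetical, and it assumes top-level results appear as unindented "ok" / "not ok" lines. It does not treat SKIP or TODO directives specially.

```shell
# Hypothetical helper (not shipped with muskie): tally top-level TAP results.
#   $1: path to a TAP file
# Indented lines (subtest results) are ignored because both patterns are
# anchored at the start of the line.
tap_summary() {
    awk '
        /^not ok([ \t]|$)/ { fail++; next }   # failed top-level test
        /^ok([ \t]|$)/     { pass++ }         # passed top-level test
        END { printf "pass=%d fail=%d\n", pass, fail }
    ' "$1"
}
```

On a webapi instance this could be invoked as "tap_summary /opt/smartdc/muskie/test.tap".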

Cleaning up after tests

Many of the integration tests require actual user accounts with which to test. The code to handle this is in "test/helper.js#ensureTestAccounts" -- the account logins are prefixed with "muskietest_", account data is cached in "/var/db/muskietest".

For faster re-runs of the test suite, these accounts are not deleted after a test run. Generally this should be fine (the muskie integration tests shouldn't be run in a production datacenter). However, if necessary, you can delete the muskie test accounts fully by:

  1. Copying the "test/" tool to "/var/tmp" in the global zone; and

  2. Running the following from the headnode global zone:

    # Delete each muskietest_ account, then restart mahi and muskie.
    function reset_muskie_test_accounts {
        sdc-useradm search muskietest_ -H -o login | while read login; do
            I_REALLY_WANT_TO_SDC_USERADM_RM=1 /var/tmp/ $login
        done
        manta-oneach -s authcache 'svcadm restart mahi'
        manta-oneach -s webapi 'svcadm restart svc:/manta/application/muskie:muskie-*'
    }
    reset_muskie_test_accounts

Dev Cycle

If you are changing node.js code only, you may benefit from the "./tools/rsync-to" script to copy local dev changes to a deployed muskie on the headnode of a development Manta (then re-run the test suite):

vi ...                      # make a code change
./tools/rsync-to HEADNODE   # where HEADNODE is an ssh name to the dev headnode

For larger changes, refer to the Operator Guide for upgrading a webapi instance in a Manta setup.


Metrics

Muskie exposes metrics via node-artedi. See the design document for more information about the metrics that are exposed, and how to access them. For development, it is probably easiest to use curl to scrape metrics:

curl http://localhost:8881/metrics

Notably, some metadata labels are deliberately not collected due to their potential for high cardinality. Specifically, remote IP address, object owner, and caller username are not collected. Metadata labels that have a large number of unique values cause memory strain on metric client processes (muskie) as well as metric servers (Prometheus). Before adding metrics or metadata labels, it is important to understand the effect they can have on the entire system. This kind of problem is unlikely to show up in a development or staging environment.
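
One way to see which metrics carry the most label combinations is to count the series per metric name in the scraped output. The sketch below assumes the endpoint returns the standard Prometheus text exposition format; the function name series_per_metric is hypothetical and not part of muskie.

```shell
# Hypothetical helper (not shipped with muskie): read Prometheus text-format
# metrics on stdin and print "<series count> <metric name>", highest count
# first, with ties ordered by name.
series_per_metric() {
    awk '
        /^#/ || /^[ \t]*$/ { next }      # skip HELP/TYPE comments and blank lines
        {
            name = $0
            sub(/[{ ].*$/, "", name)     # drop the label set and the sample value
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' | sort -k1,1nr -k2,2
}
```

For example: "curl -s http://localhost:8881/metrics | series_per_metric | head". A metric whose series count keeps growing with traffic may indicate a high-cardinality label.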

Notes on DNS and service discovery

Like most other components in Triton and Manta, Muskie (deployed with service name "webapi") uses Registrar to register its instances in internal DNS so that other components can find them. The general mechanism is documented in detail in the Registrar README. There are some quirks worth noting about how Muskie uses this mechanism.

First, while most components use local config-agent manifests that are checked into the component repository (e.g., $repo_root/sapi_manifest/registrar), Muskie still uses an application-provided SAPI manifest. See MANTA-3173 for details.

Second, Muskie registers itself with DNS domain manta.$dns_suffix (where $dns_suffix is the DNS suffix for the whole deployment). This is the same DNS name that the "loadbalancer" service uses for its instances. If you look up manta.$dns_suffix in a running Manta deployment, you get back the list of "loadbalancer" instances -- not any of the "webapi" (muskie) instances. That's because "loadbalancer" treats this like an ordinary service registration, with a service record at manta.$dns_suffix and load_balancer records beneath it that represent individual instances of the manta.$dns_suffix service, whereas "webapi" registers host records underneath that domain. As the above-mentioned Registrar docs explain, host records are not included in DNS results when a client queries for the service DNS name; they can only be used to look up the IP address of a specific instance.

The net result is that you can find the IP address of a Muskie zone whose zonename you know by querying for $zonename.manta.$dns_suffix, but there is no way to enumerate the Muskie instances using DNS. Nor is there a way to add that without changing the DNS name for webapi instances, which would be a flag day for Muppet. (This may explain why Muppet is a ZooKeeper consumer rather than just a DNS client.)
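
The per-instance lookup described above can be sketched as a small helper. The function name muskie_instance_name is hypothetical; it only assembles the $zonename.manta.$dns_suffix name given in the text and performs no DNS query itself.

```shell
# Hypothetical helper (not shipped with muskie): build the per-instance DNS
# name for a Muskie zone.
#   $1: zonename of the webapi instance
#   $2: DNS suffix of the whole deployment
muskie_instance_name() {
    printf '%s.manta.%s\n' "$1" "$2"
}
```

The result can be passed to any resolver client that is pointed at the deployment's internal DNS, for example "dig +short $(muskie_instance_name $zonename $dns_suffix)", assuming dig is installed where you run it.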

DTrace Probes

Muskie has two DTrace providers. The first, muskie, has the following probes:

The second provider, muskie-throttle, has the following probes, which will not fire if the throttle is disabled:

The script bin/throttlestat.d is an analog of moraystat.d, implemented with the queue_enter and queue_leave probes. It is a good starting point for gaining insight into both how actively a muskie process is being throttled and how much stress it is under.

The throttle probes are provided in a separate provider to prevent coupling the throttle implementation with muskie itself. Future work may involve making the throttle a generic module that can be included in any service with minimal code modification.