nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0

Feature request: applications as upstreams #978

Closed ingria closed 10 months ago

ingria commented 11 months ago

It would be useful to be able to use applications as upstreams. Something like this:

{
    "listeners": {
        "*:8001": {
            "pass": "upstreams/app_upstream"
        }
    },

    "upstreams": {
        "app_upstream": {
            "applications": {
                "app_copy_01": { "weight": 1 },
                "app_copy_02": { "weight": 1 }
            }
        }
    },

    "applications": {
        "app_copy_01": { "config" },
        "app_copy_02": { "config" }
    }
}

For example, it would help when launching multiple instances of a single-threaded application (my case).

lcrilly commented 11 months ago

You can configure the number of application instances (processes) with the applications/…/processes object (see docs).

{
    "applications": {
        "app_00": {
            "processes": 2,
            "…": "…"
        }
    }
}

ingria commented 11 months ago

Thanks. In my case that’s not an option, because each application instance needs different environment variables (for example, the Prometheus telemetry endpoint port).

lcrilly commented 11 months ago

Most applications can be scaled horizontally (with processes) so it seems unlikely that this enhancement will become a priority in the short-to-medium term.

However, there are a couple of ways that you might achieve the desired outcome with existing features:

  1. A localhost listener for each application instance, with a wildcard listener that load balances between them.
  2. A JavaScript function in the router that directs incoming requests to one of the application instances (based on a hashing algorithm).

Option 1 config

{
    "listeners": {
        "*:8001": {
            "pass": "upstreams/app_upstream"
        },
        "127.0.0.1:9001": {
            "pass": "applications/app_copy_01"
        },
        "127.0.0.1:9002": {
            "pass": "applications/app_copy_02"
        }
    },

    "upstreams": {
        "app_upstream": {
            "servers": {
                "127.0.0.1:9001": {},
                "127.0.0.1:9002": {}
            }
        }
    },

    "applications": {
        "app_copy_01": { "config" },
        "app_copy_02": { "config" }
    }
}

Option 2 config

{
    "listeners": {
        "*:9000": {
            "pass": "`applications/${split.clients(remoteAddr)}`"
        }
    },

    "applications": {
        "app_copy_01": { "config" },
        "app_copy_02": { "config" }
    },

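    "settings": {
        "js_module": "split"
    }
}

// Choose an application name from the value passed in (here, the client address).
function clients(param) {
    var c = require('crypto');
    // Hash the value and read the first two digest bytes as a signed 16-bit integer.
    var i = c.createHash('md5').update(param).digest().readInt16BE();
    // Positive values go to the first copy; zero and negative values go to the second.
    return i > 0 ? 'app_copy_01' : 'app_copy_02';
}
export default { clients }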

Option 2 should be faster because it avoids the extra network hop through the localhost listeners. However, it balances load by pinning each remote IP address to one application instance, which may not be optimal. A production-grade solution would accept an array of application names instead of being hard-coded to just two. This is just a PoC to illustrate the options.
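As a rough illustration of the "array of application names" idea, the hashing function could be generalized along these lines. This is only a sketch: the `APPS` list, the `pick` helper, and the third application name are assumptions made for the example, and it keeps the same trade-off of pinning each client address to one instance.

```javascript
var crypto = require('crypto');

// Hypothetical list of names; each must match a key under "applications".
var APPS = ['app_copy_01', 'app_copy_02', 'app_copy_03'];

// Map an arbitrary key to one entry of `apps` using an MD5-based hash.
function pick(key, apps) {
    // Read the first four digest bytes as an unsigned 32-bit integer.
    var n = crypto.createHash('md5').update(String(key)).digest().readUInt32BE(0);
    return apps[n % apps.length];
}

// Same entry point as in the PoC above: one argument, returns an application name.
function clients(param) {
    return pick(param, APPS);
}

// When used as a Unit JS module, expose it the same way as the PoC:
// export default { clients }
```

The same input always maps to the same name, so a given client keeps hitting the same instance; adding or removing names from the list reshuffles that mapping.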

ingria commented 10 months ago

I completely missed the js_module feature, thanks for the tip! It solves the problem for me.

Should I close this issue now or keep it open as a feature request?

ac000 commented 10 months ago

> Thanks. In my case that’s not an option, because each application instance needs different environment variables (for example, prometheus telemetry endpoint port)

Actually, you can specify per-application environment variables in the config, e.g. (taken from our docs):

{
    "type": "python 3.6",
    "processes": 16,
    "working_directory": "/www/python-apps",
    "path": "blog",
    "module": "blog.wsgi",
    "user": "blog",
    "group": "blog",
    "environment": {
        "DJANGO_SETTINGS_MODULE": "blog.settings.prod",
        "DB_ENGINE": "django.db.backends.postgresql",
        "DB_NAME": "blog",
        "DB_HOST": "127.0.0.1",
        "DB_PORT": "5432"
    }
}

ingria commented 8 months ago

Yes, but in that case all processes of the application will have the same environment variables.

What I need is multiple instances, each with different environment variables.

Something like this:

"app_instance_1": {
    "type": "python 3.6",
    "processes": 1,
    "working_directory": "/www/python-apps",
    "environment": {
        "DB_ENGINE": "django.db.backends.postgresql_node_1",
        "DB_NAME": "blog",
        "DB_HOST": "127.0.0.1",
        "DB_PORT": "5432"
    }
},

"app_instance_2": {
    "type": "python 3.6",
    "processes": 1,
    "working_directory": "/www/python-apps",
    "environment": {
        "DB_ENGINE": "django.db.backends.postgresql_node_2",
        "DB_NAME": "blog",
        "DB_HOST": "127.0.0.1",
        "DB_PORT": "5432"
    }
}
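
Since entries like these differ only in their environment, one way to avoid maintaining them by hand might be to generate the "applications" object with a small script and upload the result to Unit. The following is a minimal sketch, not part of Unit itself: `buildApplications`, the base config, and the `METRICS_PORT` variable name are all hypothetical and chosen only for illustration.

```javascript
// Build `count` application entries that share `base` but get their own environment.
function buildApplications(base, count, envFor) {
    var apps = {};
    for (var i = 1; i <= count; i++) {
        apps['app_instance_' + i] = Object.assign({}, base, {
            processes: 1,
            // Merge the shared variables with the per-instance ones.
            environment: Object.assign({}, base.environment, envFor(i))
        });
    }
    return apps;
}

// Shared settings, modeled on the example entries above.
var base = {
    type: 'python 3.6',
    working_directory: '/www/python-apps',
    environment: { DB_NAME: 'blog', DB_HOST: '127.0.0.1', DB_PORT: '5432' }
};

// METRICS_PORT is a made-up variable name; values are kept as strings.
var applications = buildApplications(base, 2, function (i) {
    return { METRICS_PORT: String(9100 + i) };
});

// Print the fragment so it can be merged into the full configuration.
console.log(JSON.stringify({ applications: applications }, null, 4));
```

Each generated entry can then be targeted individually by a listener or by the JS router function shown earlier in the thread.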