n8n-io / n8n

Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.
https://n8n.io

Is horizontal scaling supported #37

Closed corporatepiyush closed 4 years ago

corporatepiyush commented 5 years ago

Is horizontal scaling of workflows across machines supported in the current version?

janober commented 5 years ago

Yes and no.

In the way you mean, probably not. It is not possible if a workflow is started automatically via a trigger node such as Webhook or Cron. It is, however, possible for workflows that get started via the CLI, as described here: https://docs.n8n.io/#/start-workflows-via-cli

So it will work depending on your use-case.

n8n will, however, support proper scaling in future versions.

jkyberneees commented 4 years ago

Hi, n8n is a great project!

Is there any update on this topic? In case horizontal scaling is not yet ready, are there mechanisms that allow n8n to resume an "ongoing" workflow after an instance restart? For example, when the Docker instance running n8n gets killed in the middle of an execution?

Thanks in advance. Rolando

janober commented 4 years ago

Thanks @jkyberneees, great to hear that you like n8n!

Sorry, no update yet. Sadly, it will not be available in the next few months, as it is quite some work and needs a lot of resources that are currently not available. I will, however, start hiring some people soon to be able to tackle this and other things.

About your second question: no, all workflows currently run only in memory. Nothing gets saved between steps, so if an instance gets killed while running, there is no way to make the execution resume from where it stopped. I also do not think this would become the default, as it would slow everything down immensely. Something similar will probably be possible in the future, but then only as a workflow setting (with a global override), so that this behavior is used only where really necessary.

truong-hua commented 4 years ago

@janober So currently, if we try to run multiple instances for redundancy, will the cron/interval-triggered executions be duplicated? And regarding the webhook trigger, is it possible to have the n8n instances share the workload across the cluster if each webhook request is randomly routed to a single instance?

janober commented 4 years ago

@huaphuoctruong n8n does not scale yet.

No, that will sadly not work, as nothing for that has been implemented yet. If you start multiple n8n instances, all of them will run exactly the same thing. That means all cron and interval triggers will start the workflows on each instance. So if you have 10 n8n instances, the same workflow will be started 10 times on every trigger.

Sadly, I do not understand the second part of your question. But if you use only Webhook nodes, you can simply start multiple n8n instances and route each request to a random one. Even then, it is important to know that when you activate a workflow, it is only activated on one instance and not on all of them (as they do not communicate). That means that on any activation/deactivation, and whenever a Webhook node gets added or removed, all n8n instances have to be restarted.

truong-hua commented 4 years ago

@janober Got it. For the cron/interval case, we can solve it by adding a node after the interval trigger that atomically sets and checks a flag in Redis or a database.
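A minimal sketch of the atomic-flag idea, in TypeScript. Everything here is hypothetical and not part of n8n: `FlagStore`, `InMemoryFlagStore`, `tickKey` and `shouldRun` are illustrative names, and the in-memory map only stands in for a shared store. With Redis, the same check could be done with `SET key value NX EX ttl`, which only succeeds if the key does not exist yet.

```typescript
// Hypothetical shared store offering an atomic "set only if absent" operation.
interface FlagStore {
  setIfAbsent(key: string, owner: string): boolean;
}

// In-memory stand-in for Redis or a database table with a unique key.
// It is only "atomic" because this sketch runs in a single process.
class InMemoryFlagStore implements FlagStore {
  private readonly flags = new Map<string, string>();

  setIfAbsent(key: string, owner: string): boolean {
    if (this.flags.has(key)) {
      return false; // another instance already claimed this tick
    }
    this.flags.set(key, owner);
    return true;
  }
}

// Round the trigger time down to the start of its interval, so that all
// instances firing for the same tick compute the same key.
function tickKey(workflowId: string, firedAtMs: number, intervalMs: number): string {
  const tick = Math.floor(firedAtMs / intervalMs) * intervalMs;
  return `cron:${workflowId}:${tick}`;
}

// Returns true only for the one instance that should continue the workflow.
function shouldRun(
  store: FlagStore,
  instanceId: string,
  workflowId: string,
  firedAtMs: number,
  intervalMs: number,
): boolean {
  return store.setIfAbsent(tickKey(workflowId, firedAtMs, intervalMs), instanceId);
}

// Three instances fire for the same one-minute tick, a few milliseconds apart.
const store = new InMemoryFlagStore();
const results = ["n8n-a", "n8n-b", "n8n-c"].map((id, i) =>
  shouldRun(store, id, "workflow-1", 120000 + i * 5, 60000),
);
console.log(results); // [ true, false, false ]
```

In a real setup the flag would also need an expiry (the `EX` part of the Redis command), otherwise one key per tick accumulates forever.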

According to your response above, should I understand that n8n loads all workflows saved in the database into memory at init time and updates them only when a user modifies a workflow? Basically, if we had a hot-reload feature that periodically checks the database for updates and reloads the modified workflows, it would solve the horizontal scaling problem, right?

Is it possible to trigger a hot reload now, for example by sending a SIGHUP to the process?
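To make the hot-reload idea concrete, here is a rough TypeScript sketch of the comparison step such a periodic check would need. It is an illustration only: n8n has no such feature according to this thread, and the `WorkflowRow` shape, the `updatedAt` field and the `planReload` function are assumptions, not n8n's schema or API.

```typescript
// Hypothetical row shape; the real n8n schema may differ.
interface WorkflowRow {
  id: string;
  active: boolean;
  updatedAt: number; // epoch milliseconds
}

interface ReloadPlan {
  activate: string[];
  deactivate: string[];
  reload: string[];
}

// Compare what this instance has loaded (workflow id -> updatedAt at load
// time) with the rows currently in the database and decide what to change.
function planReload(loaded: Map<string, number>, rows: WorkflowRow[]): ReloadPlan {
  const plan: ReloadPlan = { activate: [], deactivate: [], reload: [] };
  const activeIds = new Set<string>();

  for (const row of rows) {
    if (!row.active) continue;
    activeIds.add(row.id);
    const loadedAt = loaded.get(row.id);
    if (loadedAt === undefined) {
      plan.activate.push(row.id); // activated on another instance
    } else if (row.updatedAt > loadedAt) {
      plan.reload.push(row.id); // modified since this instance loaded it
    }
  }

  // Anything loaded here that is no longer active in the database.
  loaded.forEach((_loadedAt, id) => {
    if (!activeIds.has(id)) plan.deactivate.push(id);
  });

  return plan;
}

const loaded = new Map<string, number>([["1", 100], ["2", 100], ["3", 100]]);
const rows: WorkflowRow[] = [
  { id: "1", active: true, updatedAt: 100 },
  { id: "2", active: true, updatedAt: 250 },
  { id: "3", active: false, updatedAt: 300 },
  { id: "4", active: true, updatedAt: 200 },
];
const plan = planReload(loaded, rows);
console.log(plan); // activate: ["4"], deactivate: ["3"], reload: ["2"]
```

Each instance would run this on a timer and apply the plan locally. Note that it does nothing about duplicated cron/interval executions, since every instance would still end up with the same triggers active.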

janober commented 4 years ago

Another solution for the cron/interval case is to run two different n8n instances: one that holds only the workflows with cron/interval triggers, and that then either runs the actual workflow itself or calls the workflow on the other, "scaled" n8n via a webhook.
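As a rough TypeScript illustration of that split: the single trigger instance only decides where to send each tick, and the scaled instances do the work. The host names, the `/webhook/` path layout and the `nightly-report` path are assumptions for the example, and the HTTP call is injected as a plain function so the sketch runs without a network. In practice a load balancer in front of the scaled instances could do the picking instead.

```typescript
// Hypothetical base URLs of the "scaled" n8n instances.
const workerBaseUrls = ["http://n8n-worker-1:5678", "http://n8n-worker-2:5678"];

// Function that sends the HTTP request; injected so the sketch runs offline.
type Post = (url: string) => void;

// Pick one worker at random; the random source can be replaced for testing.
function pickWorker(urls: string[], random: () => number = Math.random): string {
  return urls[Math.floor(random() * urls.length)];
}

// Called by the single trigger instance whenever its cron/interval fires.
// Only this instance holds the trigger, so each tick is forwarded once.
function forwardTick(webhookPath: string, post: Post, random?: () => number): string {
  const url = `${pickWorker(workerBaseUrls, random)}/webhook/${webhookPath}`;
  post(url);
  return url;
}

// Usage with a fake sender and a fixed "random" value.
const sent: string[] = [];
const chosenUrl = forwardTick("nightly-report", (url) => { sent.push(url); }, () => 0.75);
console.log(chosenUrl); // http://n8n-worker-2:5678/webhook/nightly-report
```

The trigger instance remains a single point of failure in this layout; it only removes the duplicate executions, not the need for that one instance to stay up.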

Workflows are not kept in memory as a whole. Only the active workflows and their webhook nodes (or rather their URLs) are. So it should be no problem to change other parts of a workflow; the most current version would still be loaded from the database once it is called. But if a webhook node gets added/removed or a workflow gets activated/deactivated, that change is only applied on one n8n instance, and that is why a restart is necessary.

Yes, hot reload is one option. What is actually planned at the moment is to make the webhook code stateless, so that it is no longer kept in memory. Instead, on activate/deactivate it writes an entry to the database with the valid webhook URLs and the workflow that should be executed for each. That would solve the problem, as any n8n instance could then simply check the database, and it would no longer matter which instance receives the request. It would obviously only solve the webhook scaling problem, though, not the one for other trigger nodes.
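A small TypeScript sketch of what such a stateless lookup could look like. This is not n8n's implementation: `WebhookTable` and `N8nInstance` are hypothetical names, and the in-memory map stands in for a database table shared by all instances.

```typescript
// Hypothetical shared table: one row per active webhook, keyed by method + path.
class WebhookTable {
  private readonly rows = new Map<string, string>();

  private static key(method: string, path: string): string {
    return `${method.toUpperCase()} ${path}`;
  }

  register(method: string, path: string, workflowId: string): void {
    this.rows.set(WebhookTable.key(method, path), workflowId);
  }

  unregisterWorkflow(workflowId: string): void {
    this.rows.forEach((id, key) => {
      if (id === workflowId) this.rows.delete(key);
    });
  }

  lookup(method: string, path: string): string | undefined {
    return this.rows.get(WebhookTable.key(method, path));
  }
}

// An instance keeps no webhook state of its own; it asks the table per request.
class N8nInstance {
  private readonly label: string;
  private readonly table: WebhookTable;

  constructor(label: string, table: WebhookTable) {
    this.label = label;
    this.table = table;
  }

  activate(workflowId: string, method: string, path: string): void {
    this.table.register(method, path, workflowId);
  }

  deactivate(workflowId: string): void {
    this.table.unregisterWorkflow(workflowId);
  }

  handle(method: string, path: string): string {
    const workflowId = this.table.lookup(method, path);
    return workflowId === undefined
      ? `${this.label}: 404`
      : `${this.label}: run ${workflowId}`;
  }
}

// Activate on one instance, receive the request on another.
const table = new WebhookTable();
const instanceA = new N8nInstance("n8n-a", table);
const instanceB = new N8nInstance("n8n-b", table);

instanceA.activate("workflow-7", "post", "/orders");
const served = instanceB.handle("POST", "/orders");
instanceA.deactivate("workflow-7");
const afterDeactivate = instanceB.handle("POST", "/orders");
console.log(served, "|", afterDeactivate); // n8n-b: run workflow-7 | n8n-b: 404
```

Because the activation lives in the shared table rather than in one process, no restart is needed for the other instance to pick it up, at the cost of one lookup per incoming request.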

To make n8n scale properly, the current plan is to later use something like https://cadenceworkflow.io/ under the hood.

jspizziri commented 3 years ago

@janober , any updates on this? Has scaling made it onto any concrete roadmap?

Love what you guys are doing here, and would really love to switch over from Huginn, but scaling is a big concern for us.

janober commented 3 years ago

Yes, it is in the making: https://github.com/n8n-io/n8n/pull/1294

janober commented 3 years ago

By the way, you can already give it a try, and we are always happy to get feedback (information about how to test it can be found in the PR). If everything goes well, we should have it merged in 1-3 weeks.

jspizziri commented 3 years ago

@janober

We'll take a look and be sure to leave any feedback if we have any. Thank you!

janober commented 3 years ago

Great, thanks a lot!

dwoldo commented 2 years ago

@janober I am trying to think of a workaround to enable distributed Cron scheduling. In production we can't rely on a single node handling all polls/triggers (Cron). Do you have any ideas, practical or theoretical, for scaling the scheduled triggers? I know there are technologies, like Cadence Workflow, that can enable this scenario, but has there been any work on this topic?

Thanks!