ohsu-comp-bio / funnel

Funnel is a toolkit for distributed task execution via a simple, standard API.
https://ohsu-comp-bio.github.io/funnel
MIT License

Considerations on scalability #602

Open kmavrommatis opened 5 years ago

kmavrommatis commented 5 years ago

Hi, I am using rabix + funnel with the AWS Batch backend to process data. The expectation is to be able to process hundreds (or even thousands) of datasets concurrently. Are there any considerations regarding the scalability of a rabix + funnel system? For example, if I wanted to submit hundreds of workflows for execution, would a single funnel server be able to handle the load? On the client side there would be thousands of rabix instances running (unless I am missing something), so that seems to be an issue as well. Or is it better to have multiple funnel servers? In that case, would all of them be able to work against the same DynamoDB, and could one of them perhaps be used to check the status of the tasks?

Thanks for your help

adamstruck commented 5 years ago

A single Funnel server should be sufficient. In this scenario it's not doing much beyond responding to polling requests from the workflow engine; the Funnel workers communicate with DynamoDB directly to update task states. On-prem, we are able to run thousands of tasks using a single Funnel server backed by Mongo without issue.
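For illustration, the polling traffic described above is roughly what the sketch below does: ask the server for a task's state at a fixed interval until the task reaches a terminal state. It uses the GA4GH TES endpoint `GET /v1/tasks/{id}?view=MINIMAL`; the base URL `http://localhost:8000`, the 5-second interval, and the helper names `fetch_task_state` and `wait_for_task` are assumptions made for this example, not something rabix or Funnel is confirmed to use.

```python
import json
import time
import urllib.request

# Terminal task states as defined by the GA4GH TES specification.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def fetch_task_state(base_url, task_id):
    """Request the minimal view of a task and return its state string."""
    url = f"{base_url}/v1/tasks/{task_id}?view=MINIMAL"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["state"]


def wait_for_task(task_id, fetch=None, base_url="http://localhost:8000",
                  interval=5.0, sleep=time.sleep):
    """Poll until the task reaches a terminal state, then return that state.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without a running server (hypothetical hooks for this sketch).
    """
    if fetch is None:
        fetch = lambda tid: fetch_task_state(base_url, tid)
    while True:
        state = fetch(task_id)
        if state in TERMINAL_STATES:
            return state
        sleep(interval)
```

Each poll is a single small read, which is consistent with the point that the server does little work per workflow; the aggregate request rate still grows with the number of clients and shrinks with a longer interval.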

With regard to DynamoDB, I haven't really tested it at scale, but it's advertised as being able to scale up read/write capacity as needed.

I don't have a good solution for managing the thousands of rabix instances... You may want to check out Cromwell for pushing CWL / WDL workflows through Funnel.

kmavrommatis commented 5 years ago

Thanks for the response. Unfortunately, as of now Cromwell does not seem to support TES with an AWS backend, and the funnel web server itself feels rather unstable; I will post another issue for this. I am running v0.8.0 from the docker image, and it crashes constantly. This results in broken rabix executions because rabix cannot communicate with the TES server.
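As a general client-side mitigation for this kind of failure, a request to the TES server can be wrapped in a retry with exponential backoff so that a brief outage does not immediately abort the caller. The sketch below is a generic pattern only; the helper name `call_with_retry` and its defaults are hypothetical, and the thread does not say whether rabix can be configured to behave this way.

```python
import time


def call_with_retry(fn, attempts=5, base_delay=1.0, max_delay=30.0,
                    sleep=time.sleep):
    """Call fn(), retrying on connection-level errors with exponential backoff.

    OSError covers ConnectionError and urllib.error.URLError, which are the
    errors typically raised when a server is unreachable. The last failure is
    re-raised once `attempts` calls have been made.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)
            # Double the wait after each failure, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

Backoff only helps with short interruptions; a server that stays down still needs to be restarted, and the retries will eventually give up.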