TritonDataCenter / node-workflow

Task orchestration, creation and running using NodeJS
MIT License

Performance #124

Closed gepser closed 3 years ago

gepser commented 9 years ago

Hi, I was wondering if there is a way to improve the performance. I noticed this: while the jobs are being added to redis, the worker doesn't do anything and only starts its work once the queue is full. This process can take a while, sometimes around 40 minutes on a MacBook Pro just to add 1k jobs to the queue; I think it should be faster.

Another thing: when I try to add 10k jobs to the queue it's just not possible; it gives me an abort alert saying there is not enough memory and stops after trying for a while.

Is there a way to improve this or am I missing something?

kusor commented 9 years ago

Hi gepser, I'll try to reproduce the "the worker isn't doing anything and just starts its work until the queue is full" case; if it happens, that's a bug in the wf-redis-backend module. The worker should begin pulling jobs as soon as there's one available.

Regarding the memory issues when adding 10k jobs to the queue, I'm afraid that's not only related to the wf module, but also to the redis server running on that same MacBook.

Anyway, could you add the code you're using for the operations you mention here so I can take a look? Thanks in advance!

gepser commented 9 years ago

Hi kusor, thanks for your answer. To make it easy for you to reproduce the case where "the worker isn't doing anything and just starts its work until the queue is full", I took your code (from the node-workflow-example repo), used the redis config, and, after modifying your code a little bit, reproduced that case.

The things I modified:
- I added the delete-gist step as part of the workflow because I didn't want to end up with 5k useless gists.
- In the module.js file I added a loop enclosing the factory.job call, roughly like the sketch below (see the attached screenshot, "screen shot 2014-10-15 at 3 26 28 pm").
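For clarity, the change was roughly the following (a minimal sketch, not the literal module.js from node-workflow-example; the backend setup and the exact options passed to factory.job are assumptions, the point is just the plain loop enclosing it):

```js
// Minimal sketch (assumed setup, not the exact example code): queue 5000 jobs
// against an already-created workflow in a plain loop enclosing factory.job.
var wf = require('wf');
var WorkflowRedisBackend = require('wf-redis-backend');

var backend = new WorkflowRedisBackend({ host: '127.0.0.1', port: 6379 });

backend.init(function () {
    var factory = wf.Factory(backend);

    // Placeholder: in the real module.js the workflow is created earlier
    // with factory.workflow(...); here we just assume its uuid.
    var aWorkflowUuid = 'uuid-of-the-gist-workflow';

    for (var i = 0; i < 5000; i++) {
        factory.job({
            workflow: aWorkflowUuid,
            params: { /* per-job parameters, e.g. the gist contents */ }
        }, function (err, job) {
            if (err) {
                console.error('Error queueing job: ' + err);
            } else {
                console.log('Queued job ' + job.uuid);
            }
        });
    }
});
```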

... and that's it. First I ran it with 1k requests and it was fast: the workers started doing their jobs quickly and I couldn't see anything wrong. So I changed the loop to 5000 and then the bug was reproduced. It took about 5 minutes before either side (module.js and the worker) showed any activity, so module.js takes too long (~5 min) to put the data into redis. I checked: no keys were being added, then suddenly 5k keys appeared at once, and only after that did the workers start their jobs.

Let me know if you can reproduce that case and tell me if I'm doing something wrong. I'll look into your code and try to find out why that is happening.

Thanks!

gepser commented 9 years ago

Update: the bug only appears when there is already something in redis; if I do a FLUSHDB before starting, everything works as expected. That is going to be my workaround for now.

cjonagam commented 9 years ago

Yes, I observed this bug too.


kusor commented 9 years ago

Thanks for the update folks, especially for the FLUSHDB clue. As you've already figured out at this point, I'm using the postgres backend because I want persistence. I'll take a look at this during the weekend and try to figure out what's going on with the client when the DB has contents.

cjonagam commented 9 years ago

Any solution for this @gepser @kusor ?

gepser commented 9 years ago

Not yet; I think the solution is a long-term one. In my free time I've been looking into this code to find the issue and make it better, but I haven't found a solution yet, just the workaround. A possible temporary fix could be doing the FLUSHDB from the node app instead of doing it manually, something like the sketch below. @cjonagam
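That temporary fix would look roughly like this (a sketch using the node_redis client; connection details are placeholders, and it wipes the whole DB, so it only works when redis holds nothing but workflow state):

```js
// Sketch of the temporary fix: flush redis from the node app before queueing
// jobs, instead of running FLUSHDB manually. Connection details are placeholders.
var redis = require('redis');
var client = redis.createClient(6379, '127.0.0.1');

client.flushdb(function (err, reply) {
    if (err) {
        console.error('FLUSHDB failed: ' + err);
        process.exit(1);
    }
    console.log('Redis flushed (' + reply + '); safe to start queueing jobs now');
    client.quit();
    // ...then create the factory and call factory.job() as usual.
});
```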

But we should probably keep this issue open so other people can see the workaround. Keep in mind, though, that this is probably a long-term issue and the solution will likely require rewriting some parts of node-workflow from scratch (I'm guessing).