gorules / zen

Open-source Business Rules Engine for your Rust, NodeJS, Python or Go applications.
https://gorules.io
MIT License

Getting timeout issue while executing the function #143

Open Yash1256 opened 2 months ago

Yash1256 commented 2 months ago

Hello team, I have been using zen-engine in our project; great work, by the way! I ran into a "Timeout exceeded" error while running my function. From the documentation I found that function executions have a 50ms timeout. Is it possible to configure this timeout, i.e. increase it as required? I was unable to find this in the documentation.

It would be very helpful if there is a workaround or a proper solution for this problem. Please help me out here.

Thanks

Yash1256 commented 2 months ago
{
  "code": 400,
  "message": "{\"type\":\"NodeError\",\"nodeId\":\"06a1d32a-863b-4887-88f3-b18d27908284\",\"source\":\"Timeout exceeded\"}"
}

This was the error I got when trying to execute the function.
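The `message` field in the payload above is itself a JSON-encoded string, so it has to be parsed a second time to reach the node error. A minimal sketch of doing that in application code, assuming the payload has exactly the shape shown above; the helper names `parseNodeError` and `isFunctionTimeout` are hypothetical and not part of zen-engine:

```javascript
// Hypothetical helper: decode the JSON string carried in `message`.
// Returns null when the message is not JSON or not a NodeError.
function parseNodeError(payload) {
  try {
    const inner = JSON.parse(payload.message);
    return inner && inner.type === "NodeError" ? inner : null;
  } catch {
    return null;
  }
}

// True when the node failed because the function timeout was hit.
// Assumption: the timeout is reported as source === "Timeout exceeded".
function isFunctionTimeout(payload) {
  const nodeError = parseNodeError(payload);
  return nodeError !== null && nodeError.source === "Timeout exceeded";
}

// The payload reported above.
const timeoutPayload = {
  code: 400,
  message:
    '{"type":"NodeError","nodeId":"06a1d32a-863b-4887-88f3-b18d27908284","source":"Timeout exceeded"}',
};

console.log(parseNodeError(timeoutPayload).nodeId, isFunctionTimeout(timeoutPayload));
```

The `nodeId` identifies which node in the decision graph failed, which helps when a graph contains more than one function node.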

Yash1256 commented 2 months ago

https://github.com/gorules/zen/blob/82ff028e3ac7bfaea589c9f295ca2b7374a1ae18/core/engine/src/handler/function/mod.rs#L20

I am guessing this is where the timeout for function execution is set? If I modify it, how do I compile the engine and use it in my project? @stefan-gorules @ivanmiletic @mesaugat

ivanmiletic commented 2 months ago

Hi @Yash1256 we have intentionally limited FNs to 50ms to protect against infinite loops. Would you be able to provide:

Engine (e.g. Rust, Go...):
Version:
Runner (e.g. Macbook m1, Lambda, Docker...):

Yash1256 commented 2 months ago

Hey @ivanmiletic, we are doing some large computations in the functions, which is why the execution time exceeds 50ms.

Engine: nodejs
Version: "@gorules/zen-engine": "^0.14.0"
Runner: amazon/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20231207

Yash1256 commented 2 months ago

If you could tell me the steps to increase the timeout and then how to build and use the engine, that would also work. Is the code I pointed to the right place for the timeout? If I increase it there and compile, will that work?

ivanmiletic commented 2 months ago

It is extremely hard to build the engines unless you use our pipelines. Let me check the ETA for releasing it with a configurable timeout, as we have a lot of other priorities.

We would always suggest against putting extremely complex logic in the FN.

For such complex calculations we will expose a new Custom Node that will allow you to write any code in your programming language that can be executed by the engine. For example, you will be able to write a Custom Node that connects to another microservice, a database, or other systems.

This feature is new and not officially released, it exists in the code but is not yet documented.

Yash1256 commented 2 months ago

Thanks, but we need the timeout increased as a priority because we are planning a release and this is required for it. Sometimes the code hits the timeout and sometimes it passes, which will intermittently block our computation pipeline.

If you could increase the timeout to 2s/3s on your end (to be on the safe side) and share a build with me, that would also be very helpful. @ivanmiletic Thanks

rwnd commented 2 months ago

@ivanmiletic More context here: we have a function that iterates over an array of JSON input and returns the sum of a derived value. The computation for the same JSON input fails about 20-30% of the time with the Timeout Exceeded error. Just increasing the 50ms to 60ms would be sufficient to handle these failures for our use case. But since our computations happen on an async thread, a much higher value would also work.

Would it be possible to make the function timeout a configurable parameter that can be set at initialization? That might be a simple change if there are other configurable parameters used by the engine.
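Until the timeout is configurable, one possible stopgap for intermittent failures like these is to retry the evaluation when it fails with a timeout. A minimal sketch, assuming the failure surfaces as a thrown `Error` whose message contains `Timeout exceeded` (as in the payload earlier in this thread) and that the evaluation is safe to repeat; `evaluateWithRetry` and the `evaluate` callback are hypothetical names, not zen-engine API:

```javascript
// Hypothetical wrapper: `evaluate` is any async function that runs the
// decision (for example a closure around your own engine call).
async function evaluateWithRetry(evaluate, { attempts = 3 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await evaluate();
    } catch (error) {
      // Assumption: timeouts are recognisable from the error message.
      const isTimeout = String(error && error.message).includes("Timeout exceeded");
      if (!isTimeout) throw error; // do not retry unrelated failures
      lastError = error;
    }
  }
  throw lastError; // every attempt timed out
}

// Usage with a stand-in evaluation that times out twice, then succeeds.
let calls = 0;
const flaky = async () => {
  calls += 1;
  if (calls < 3) throw new Error('{"type":"NodeError","source":"Timeout exceeded"}');
  return { sum: 42 };
};

const retryDemo = evaluateWithRetry(flaky).then((result) => {
  console.log(result, "after", calls, "calls");
  return result;
});
```

This only helps when the function normally finishes close to the limit. A function that always needs more than 50ms will fail on every attempt.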

Yash1256 commented 2 months ago

Hey @ivanmiletic, could you please help us out here?

ivanmiletic commented 2 months ago

@Yash1256 Can you upgrade to the newest version of zen engine for nodejs, 0.17.0? We have replaced v8 with quickjs for functions since 0.15.0. Again, not sure if it will work, but it is worth trying out.

Yash1256 commented 2 months ago

Have you also removed the Timeout error and changed the error message in the latest version? I am now receiving this message:

{
    "code": 400,
    "message": "{\"type\":\"NodeError\",\"nodeId\":\"06a1d32a-863b-4887-88f3-b18d27908284\",\"source\":\"{\\\"description\\\":\\\"Error:9:15 interrupted\\\\n    at handler (eval_script:9:15)\\\\n    at <anonymous> (internals:23:13)\\\\n    at <eval> (eval_script:25:4)\\\\n\\\",\\\"message\\\":\\\"interrupted\\\"}\"}",
    "stack": "Error: {\"type\":\"NodeError\",\"nodeId\":\"06a1d32a-863b-4887-88f3-b18d27908284\",\"source\":\"{\\\"description\\\":\\\"Error:9:15 interrupted\\\\n    at handler (eval_script:9:15)\\\\n    at <anonymous> (internals:23:13)\\\\n    at <eval> (eval_script:25:4)\\\\n\\\",\\\"message\\\":\\\"interrupted\\\"}\"}"
}
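In this newer format the `source` field is a second JSON-encoded string, so the message has to be decoded twice. A minimal sketch, assuming the exact shape shown above, that pulls out the inner `message` and the `line:column` position from the first line of `description`; `describeInterrupt` is a hypothetical helper, not part of zen-engine:

```javascript
// Hypothetical helper: decode message -> source -> { description, message }.
function describeInterrupt(payload) {
  const nodeError = JSON.parse(payload.message);
  const inner = JSON.parse(nodeError.source);
  // The description starts with e.g. "Error:9:15 interrupted".
  const match = /^Error:(\d+):(\d+)/.exec(inner.description);
  return {
    nodeId: nodeError.nodeId,
    reason: inner.message,
    line: match ? Number(match[1]) : null,
    column: match ? Number(match[2]) : null,
  };
}

// Rebuild the payload shown above by encoding it in the same two layers.
const interruptedSource = JSON.stringify({
  description:
    "Error:9:15 interrupted\n    at handler (eval_script:9:15)\n" +
    "    at <anonymous> (internals:23:13)\n    at <eval> (eval_script:25:4)\n",
  message: "interrupted",
});
const interruptedPayload = {
  code: 400,
  message: JSON.stringify({
    type: "NodeError",
    nodeId: "06a1d32a-863b-4887-88f3-b18d27908284",
    source: interruptedSource,
  }),
};

console.log(describeInterrupt(interruptedPayload));
```

The reported position can help narrow down which part of the handler was running when the execution was interrupted.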

ivanmiletic commented 2 months ago

It seems the function is too heavy for 50ms on both JS engines.

We will create a task to expose the limit config, but we can't prioritise its delivery as we have a few tasks from our enterprise and business customers.

In case you are able to expose it, we are open to contributions.

michaelnero commented 2 months ago

i've also hit this timeout issue perf testing a service using the v8 version of the zen-engine npm package. i can reproduce the timeouts under heavy load using even simple js functions that don't do a lot of work. i've looked at the code before you brought in quickjs, and though i certainly didn't put rigorous effort into proving this, my gut tells me that that code would start to randomly timeout under heavy load purely due to resource contention rather than the javascript function itself taking too long. at least this is also what testing heavy loads in a resource-constrained environment locally suggests.

i don't get any timeout errors using the quickjs version but the code is also different now, so we're all good, but i do want to 👍🏻 that at the very least, the timeout should be configurable. also -- and i can't think of a way to do this -- it would be great if we could distinguish between "i can't get enough time to execute your function because of the environment" vs "your function execution took too long" errors.

ivanmiletic commented 2 months ago

Hi, for now please use the latest version with quickjs; it is lighter and startup times are better. We will take a look at exposing an env variable in the near future.