SGrondin / bottleneck

Job scheduler and rate limiter, supports Clustering
MIT License
Script errors running in cluster mode - ERR value is not an integer or out of range #145

Closed (sapientsteve closed this issue 4 years ago)

sapientsteve commented 4 years ago

Running Bottleneck 2.17.1 against AWS ElastiCache Redis 5.0.6. We have two AWS environments (accounts) that are identical in terms of the Node runtime (Node 12) and ElastiCache version, and they run the exact same code. In one environment Bottleneck is perfectly happy using Redis; in the other, Bottleneck throws errors.
Bottleneck initialization options:

{
    "id": "api_throttler",
    "maxConcurrent": null,
    "minTime": 66.667,
    "datastore": "redis",
    "timeout": 33000,
    "clientOptions": {
        "host": redis_host,
        "port": 6379
    }
}

Debug messages from the "bad" environment show:

debug: msg=Calling Redis script: register.lua, data=[1580052652636,"awlg87bwrkh","2wj5xtc7r75","1",""]
debug: msg=Event triggered: error, data=[
{
    "command": "EVALSHA",
    "args": [
        "c455d8a6915285f079ceb0dec89a61a1f831d10b",
        8,
        "b_api_throttler_settings",
        "b_api_throttler_job_weights",
        "b_api_throttler_job_expirations",
        "b_api_throttler_job_clients",
        "b_api_throttler_client_running",
        "b_api_throttler_client_num_queued",
        "b_api_throttler_client_last_registered",
        "b_api_throttler_client_last_seen",
        1580052652636,
        "awlg87bwrkh",
        "2wj5xtc7r75",
        "1",
        ""
    ],
    "code": "ERR"
}
]
bottleneck limiter error: ReplyError: ERR Error running script (call to f_c455d8a6915285f079ceb0dec89a61a1f831d10b): @user_script:226: ERR value is not an integer or out of range 

The messages above are repeated for every limiter task. I've tried comparing the bottleneck debug messages between the environments and I can't identify any differences except that one environment fails on every call. FWIW, prior to using bottleneck, the code successfully interacts with redis to do other things.

I'm also using limiter.ready() to wait for bottleneck to be happy.

Any tips on how to resolve this would be greatly appreciated. We've been using Bottleneck for the past year and it's been a lifesaver.

sapientsteve commented 4 years ago

Duh... the problem was that the Bottleneck config property minTime was not an integer (66.667 in the options above). That value is calculated from an API rate limit expressed in "transactions per second". Making that value an integer solved the problem.
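For illustration, a minimal sketch of deriving an integer minTime from a transactions-per-second limit. The helper name `minTimeFromTps` and the rate of 15 TPS are assumptions (15 TPS is merely consistent with the 66.667 in the options above). Rounding up rather than down is a design choice that keeps the spacing at or above what the limit requires.

```javascript
// Hypothetical helper (not part of Bottleneck): convert a
// transactions-per-second limit into an integer minTime, i.e. the
// number of milliseconds to wait between job starts.
function minTimeFromTps(tps) {
  if (!Number.isFinite(tps) || tps <= 0) {
    throw new RangeError(`tps must be a positive number, got ${tps}`);
  }
  // Round up so that no more than `tps` jobs can start per second.
  return Math.ceil(1000 / tps);
}

// Assumed rate for illustration: 1000 / 15 is about 66.667 before rounding.
const minTime = minTimeFromTps(15);
console.log(minTime); // 67
```

The result can then be passed as the minTime option in place of the fractional value.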

Not sure any fix is warranted, other than possibly adding a check on that config value, because the error message that comes back from Redis via Bottleneck wasn't very helpful for me.
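Such a check could look roughly like the sketch below, run in application code before the options reach the limiter. `assertIntegerOptions` is a hypothetical function, not part of Bottleneck's API, and the default field list contains only minTime because that is the only option this thread shows to be affected. Checking further options would be an assumption.

```javascript
// Hypothetical pre-flight check (not part of Bottleneck's API).
// Throws a descriptive error if a listed option is set but not an integer.
function assertIntegerOptions(options, fields = ["minTime"]) {
  for (const field of fields) {
    const value = options[field];
    // null / undefined means "not set", which is left alone.
    if (value === null || value === undefined) continue;
    if (!Number.isInteger(value)) {
      throw new TypeError(
        `Limiter option "${field}" must be an integer, got ${value}`
      );
    }
  }
  return options;
}

// The fractional value from the original report is rejected up front.
try {
  assertIntegerOptions({ id: "api_throttler", minTime: 66.667 });
} catch (err) {
  console.log(err.message);
}
```

Failing early this way puts the name of the offending option in the error message, instead of a script line number from Redis.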

RichardWright commented 3 years ago

@sapientsteve Hi Steve, are you using Redis with cluster mode enabled? I'm having some issues with it and I'm trying to establish whether Bottleneck is compatible with it.

sapientsteve commented 3 years ago

@RichardWright Yes, we are using cluster mode.

thelaughingwolf commented 3 years ago

I've also seen this error, although I had already set minTime.

Configuration:

{
    maxConcurrent: 5,
    minTime: Math.ceil(1000 / rateLimit), // Provided from env variable, currently 700
    id: `apiv3-zendesk-${host}`,
    // Clustering options
    datastore: "redis",
    clearDatastore: false, // true to clear Redis on start
    timeout: (5 * 60 * 1000), // 5 minutes
    clientOptions: { . . . }
}

I was using Bottleneck with local storage without any problems. I added Redis support and tested my app a few times. After a handful of restarts, I started getting the same error @sapientsteve saw, and was likewise confused. I added clearDatastore: true to the Bottleneck options and the app was able to start. I then changed the options back to clearDatastore: false, and have not seen the error again.

I wish I could provide more detail, but I have been unable to reproduce the error since then. I failed to save the error or my configuration at the time, unfortunately. It's possible I hadn't added Math.ceil() to the config yet, although I thought I did that first.

thelaughingwolf commented 3 years ago

Since my last comment, I've had the issue recur. I've tried with reservoirs, I've tried using ioredis instead of node-redis, I've tried almost every variation of configuration, and occasionally, Bottleneck just gets constant errors.

I am completely unable to replicate the errors reliably. When they start happening, only fiddling with the configuration options can (eventually) fix the problem, but that's obviously not a production-ready solution.

I opened a dedicated issue for this: https://github.com/SGrondin/bottleneck/issues/183