vutran1710 / PyrateLimiter

⚔️ Python Rate-Limiter using Leaky-Bucket Algorithm Family
https://pyratelimiter.readthedocs.io
MIT License

sqlite - 1 extra call when running from two scripts #60

Closed SpiffSpaceman closed 2 years ago

SpiffSpaceman commented 2 years ago

Hello, I run the code below from two different .py files at the same time, and it looks like there might be an issue: for each window, one extra call sometimes seems to pass through (e.g. at 20:54:14).

At 2/sec and 60/min, I should hit the 60-per-minute limit after about 30 seconds, but it seems to happen after around 20 seconds. Also, sometimes 61 calls pass through instead of 60.

import datetime
from pyrate_limiter import Limiter, SQLiteBucket, RequestRate, Duration

limiter = Limiter(
    RequestRate(2, Duration.SECOND),
    RequestRate(60, Duration.MINUTE),
    bucket_class=SQLiteBucket,
    bucket_kwargs={'path': '/tmp/pyrate_limiter.sqlite'},
)

i = 0
def limitMe():
    global limiter, i
    with limiter.ratelimit("Test", delay=True):
        i += 1
        print(i, datetime.datetime.now())

while True:
    limitMe()

Output (one column per script) ---

1 2021-12-18 20:54:10.003831                                1 2021-12-18 20:54:11.002484
2 2021-12-18 20:54:10.005163                                2 2021-12-18 20:54:11.005458
3 2021-12-18 20:54:14.005357                                3 2021-12-18 20:54:12.003581
4 2021-12-18 20:54:14.006829                                4 2021-12-18 20:54:12.005608
5 2021-12-18 20:54:15.015675                                5 2021-12-18 20:54:13.004611
6 2021-12-18 20:54:16.017773                                6 2021-12-18 20:54:13.006189
7 2021-12-18 20:54:17.019648                                7 2021-12-18 20:54:14.013142
8 2021-12-18 20:54:18.013569                                8 2021-12-18 20:54:15.007652
9 2021-12-18 20:54:18.015072                                9 2021-12-18 20:54:15.009291
10 2021-12-18 20:54:19.015664                               10 2021-12-18 20:54:16.009579
11 2021-12-18 20:54:19.017274                               11 2021-12-18 20:54:16.011135
12 2021-12-18 20:54:20.025862                               12 2021-12-18 20:54:17.011623
13 2021-12-18 20:54:21.019726                               13 2021-12-18 20:54:17.013142
14 2021-12-18 20:54:21.021131                               14 2021-12-18 20:54:18.016630
15 2021-12-18 20:54:22.021467                               15 2021-12-18 20:54:19.023759
16 2021-12-18 20:54:22.022795                               16 2021-12-18 20:54:20.017749
17 2021-12-18 20:54:23.026384                               17 2021-12-18 20:54:20.019240
18 2021-12-18 20:54:24.033477                               18 2021-12-18 20:54:21.022626
19 2021-12-18 20:54:25.028553                               19 2021-12-18 20:54:22.024452
20 2021-12-18 20:54:25.030052                               20 2021-12-18 20:54:23.023427
21 2021-12-18 20:54:26.030829                               21 2021-12-18 20:54:23.024933
22 2021-12-18 20:54:26.032425                               22 2021-12-18 20:54:24.026459
23 2021-12-18 20:54:27.033052                               23 2021-12-18 20:54:24.027888
24 2021-12-18 20:54:27.034598                               24 2021-12-18 20:54:25.036446
25 2021-12-18 20:54:28.035196                               25 2021-12-18 20:54:26.038554
26 2021-12-18 20:54:28.036772                               26 2021-12-18 20:54:27.041018
27 2021-12-18 20:54:29.055405                               27 2021-12-18 20:54:28.043134
28 2021-12-18 20:54:30.054569                               28 2021-12-18 20:54:29.044325
29 2021-12-18 20:54:31.049022                               29 2021-12-18 20:54:29.051676
                                                            30 2021-12-18 20:54:30.046738
                                                            31 2021-12-18 20:54:30.048332
                                                            32 2021-12-18 20:54:31.052249
JWCook commented 2 years ago

I can look into this sometime in the next few days. My hunch is that for this to work with multiple processes, we may need to put a given read+write within an exclusive transaction.
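
For reference, here is a standalone sketch (plain sqlite3, not PyrateLimiter code) of what putting the read+write inside an exclusive transaction could look like; the table name, schema, and window logic are illustrative assumptions:

import sqlite3
import time

LIMIT = 2       # max calls allowed per window
WINDOW = 1.0    # window length in seconds

def try_acquire_slot(path='/tmp/demo_bucket.sqlite'):
    # isolation_level=None puts sqlite3 in autocommit mode so the transaction
    # boundaries below can be managed explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute('CREATE TABLE IF NOT EXISTS bucket (ts REAL)')
    try:
        # BEGIN EXCLUSIVE takes the write lock up front, so the SELECT and the
        # INSERT behave as one atomic step: a second process cannot slip in
        # between the count and the write.
        conn.execute('BEGIN EXCLUSIVE')
        now = time.time()
        (count,) = conn.execute(
            'SELECT COUNT(*) FROM bucket WHERE ts > ?', (now - WINDOW,)
        ).fetchone()
        if count >= LIMIT:
            conn.execute('ROLLBACK')
            return False
        conn.execute('INSERT INTO bucket (ts) VALUES (?)', (now,))
        conn.execute('COMMIT')
        return True
    finally:
        conn.close()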

SpiffSpaceman commented 2 years ago

Hello, yes, running it within a single transaction seems to solve this.

I tried to look at the code; it seems that try_acquire() supports multiple buckets, and perhaps that adds some complications.

My own use case is very simple since I have only one SQLite bucket, and it seems to be working. I ran it a few times, and every time it took about 30s to hit the minute limit and only ran 60 times.

What I did: 1) begin a transaction after _init_buckets(), 2) commit at the end of the function, and 3) call rollback before raising BucketFullException. There might be a side effect in that one of the instances seems to get favored more than the other (e.g. 40 vs 20 or 35 vs 25 calls), but I did start them manually, one by one. Not much of an issue for me, as my limits will be much higher and will rarely be hit.

But I don't know how that will impact other use cases with multiple buckets. If I try to obtain a lock for each bucket before proceeding, there is a risk of deadlock, and if I lock each bucket individually within the loop, then perhaps two calls could interfere by each satisfying the individual rules. Best to wait for your fix.

Thanks for the nice library.

SpiffSpaceman commented 2 years ago

Hello. 1) In addition, I also got errors from sqlite3 on multithreaded access within a single script: "sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140152770070272 and this is thread id 140152736384768." I had to pass 'check_same_thread': False in bucket_kwargs (plus 'isolation_level': 'EXCLUSIVE' for the transaction change above).

2) Next I got another error, "sqlite3.OperationalError: cannot start a transaction within a transaction". For this I had to use a threading lock to prevent simultaneous access to try_acquire(). With that, it seems to work well for my use case now, with multiple processes and multithreaded access within each process.

Both of these changes were done outside the library code, but perhaps they could be done internally too if it makes sense. Thanks.
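
A sketch of that combination, reusing the limiter from the example above and assuming (as described here) that bucket_kwargs other than 'path' are forwarded to sqlite3.connect; BucketFullException handling is omitted for brevity:

import threading
from pyrate_limiter import Limiter, SQLiteBucket, RequestRate, Duration

limiter = Limiter(
    RequestRate(2, Duration.SECOND),
    RequestRate(60, Duration.MINUTE),
    bucket_class=SQLiteBucket,
    bucket_kwargs={
        'path': '/tmp/pyrate_limiter.sqlite',
        'check_same_thread': False,      # allow the connection to be used from multiple threads
        'isolation_level': 'EXCLUSIVE',  # wrap each read+write in an exclusive transaction
    },
)

acquire_lock = threading.Lock()

def limit_me(name='Test'):
    # Serialize try_acquire() within this process to avoid
    # "cannot start a transaction within a transaction"; the exclusive
    # SQLite transaction coordinates the separate processes.
    with acquire_lock:
        limiter.try_acquire(name)  # raises BucketFullException when a rate is exceeded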

SpiffSpaceman commented 2 years ago

Just a note: the monotonic clock apparently resets after a power cycle; it seems to return the time elapsed since the last power-on.

Because of this, if we try to get the lock before a restart (which works) and then again after the power restart, remaining_time in the exception becomes corrupted, as it compares values from before and after the restart. I can simply use time.time, which is good enough for me (no DST here), but this is a bit of a gotcha, especially if you need a monotonic clock.

vutran1710 commented 2 years ago

Just a note: the monotonic clock apparently resets after a power cycle; it seems to return the time elapsed since the last power-on.

Because of this, if we try to get the lock before a restart (which works) and then again after the power restart, remaining_time in the exception becomes corrupted, as it compares values from before and after the restart. I can simply use time.time, which is good enough for me (no DST here), but this is a bit of a gotcha, especially if you need a monotonic clock.

Interesting finding. I will make a note of this in the document.

JWCook commented 2 years ago

Sorry for the delay on this! I started some changes in #61, but no unit tests yet.

I was hoping to be able to make changes only within SQLiteBucket, but it looks like some changes are needed in Limiter as well. I added a no-op AbstractBucket.commit() method, and called that within try_acquire(). That would also probably be needed for any other SQL-based backends added in the future.

For multi-threaded usage, I'm leaning toward putting the lock in SQLiteBucket instead of try_acquire(), since I don't believe the other bucket backends need this. I could be mistaken, though.
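
For illustration, a rough sketch of that shape (toy classes only, not the actual changes in #61 or PyrateLimiter's real internals): a no-op commit() on the base bucket that callers can invoke unconditionally, overridden only by the SQL-backed bucket, which also carries its own lock for multi-threaded use:

import sqlite3
import threading
from abc import ABC, abstractmethod

class BaseBucket(ABC):
    @abstractmethod
    def put(self, timestamp: float) -> None: ...

    def commit(self) -> None:
        """No-op for in-memory backends; only SQL-backed buckets override this."""

class MemoryBucket(BaseBucket):
    def __init__(self):
        self.items = []

    def put(self, timestamp: float) -> None:
        self.items.append(timestamp)
    # inherits the no-op commit(), so callers never special-case the backend

class SqliteBucketSketch(BaseBucket):
    def __init__(self, path: str):
        # Keeping the lock inside the bucket (rather than in the limiter)
        # means only the backend that needs it pays for it.
        self._lock = threading.Lock()
        self._conn = sqlite3.connect(
            path, isolation_level='EXCLUSIVE', check_same_thread=False
        )
        self._conn.execute('CREATE TABLE IF NOT EXISTS bucket (ts REAL)')

    def put(self, timestamp: float) -> None:
        with self._lock:
            # The first write implicitly opens an exclusive transaction.
            self._conn.execute('INSERT INTO bucket (ts) VALUES (?)', (timestamp,))

    def commit(self) -> None:
        with self._lock:
            self._conn.commit()  # ends the exclusive transaction opened by the write

# Caller-side, the commit can then be unconditional for every backend:
#     bucket.put(now)
#     bucket.commit()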

JWCook commented 2 years ago

Just a note: the monotonic clock apparently resets after a power cycle; it seems to return the time elapsed since the last power-on.

Interesting, I didn't know that either. @vutran1710 I wonder if we should go back to using time.time() by default instead of monotonic? It seems to have a few surprising behaviors, and users who want a monotonic clock can still opt in with time_function.
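
If that change is made, the user-side opt-in could look roughly like this (a sketch built around the time_function parameter mentioned above):

import time
from pyrate_limiter import Limiter, RequestRate, Duration

# Wall-clock time: consistent across reboots, but can jump if the system clock is adjusted.
wall_clock_limiter = Limiter(
    RequestRate(60, Duration.MINUTE),
    time_function=time.time,
)

# Monotonic time: immune to clock adjustments, but restarts from zero after a power cycle.
monotonic_limiter = Limiter(
    RequestRate(60, Duration.MINUTE),
    time_function=time.monotonic,
)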

JWCook commented 2 years ago

Sorry, it looks like this may not be fixed after all. I was working with the Python multiprocessing module, which does not behave the same as your example of running from two separate scripts.

JWCook commented 2 years ago

Continuing this in #67