CoreyKaylor / Lightning.NET

.NET library for LMDB key-value store

Debugging erratic performance #96

Closed by mr-miles 7 years ago

mr-miles commented 7 years ago

Hi,

We are using lightningdb to cache binary data in a service. The data file is approx 600 MB; keys are 5 bytes and values are about 2 KB. Clients of the service send requests containing 200-2000 keys; the service pulls out the values for those keys and returns them.

Running locally, we can get <100ms return times consistently. However, we've found the read performance varies massively depending on the machine, with the worst culprit taking 10s to return the data. A second request can then either take <100ms or 10s - unpredictably.

Do you know of any pertinent performance counters or diagnostics we can look at to dig further into what's happening?

My previous experiences with lightningdb have all been fantastic "it just worked" ones (thanks!), and from reading about how LMDB works, there doesn't seem to be a lot that could go wrong. So I'm confused about how to diagnose what's going on here - any insight would be very much appreciated!

Miles

wanton7 commented 7 years ago

Do you have SSD drives on all computers? If not, that might explain it. From what I've read about LMDB, it might do a lot of random seeks when reading. I found this on the subject: https://www.openldap.org/lists/openldap-devel/201502/msg00054.html
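To put rough numbers on the random-seek idea, here is a back-of-envelope sketch in Python. It assumes every lookup misses the OS page cache and costs exactly one random read; the per-read latencies are assumed ballpark figures for a spinning disk and an SSD, not measurements from this setup, and `estimate_request_seconds` is a hypothetical helper written only for this illustration.

```python
# Back-of-envelope estimate: time spent on disk reads for one request when
# lookups miss the OS page cache and each miss costs one random read.
# The latencies below are assumed ballpark figures, not measurements.

ASSUMED_READ_LATENCY_S = {
    "hdd": 0.005,   # ~5 ms per random read (seek + rotation), assumed
    "ssd": 0.0001,  # ~0.1 ms per random read, assumed
}


def estimate_request_seconds(num_keys, media, cache_miss_ratio=1.0):
    """Estimate seconds spent on disk reads for a request of num_keys lookups."""
    return num_keys * cache_miss_ratio * ASSUMED_READ_LATENCY_S[media]


if __name__ == "__main__":
    for media in ("hdd", "ssd"):
        for num_keys in (200, 2000):
            seconds = estimate_request_seconds(num_keys, media)
            print(f"{media}: {num_keys} keys, all cold -> ~{seconds:.2f} s")
```

Under these assumptions, a fully cold 2000-key request on a spinning disk comes out at around 10 s, while the same request on an SSD stays near 0.2 s. A warm request (miss ratio close to 0) would spend almost no time on disk, which would be consistent with the "<100ms or 10s" pattern described in the original post.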

mr-miles commented 7 years ago

Thanks, that's useful. SSDs are indeed a difference between the two.

However, the db should be written in order and doesn't get a chance to go out of sync in these tests, so it's hard to see how seeking around could be the cause. But that email chain is too similar to dismiss, so I'll check on that and see if I can get some SSDs for testing.

Thanks for your help

Miles

mr-miles commented 7 years ago

This was sorted in the end by redeploying to a Windows Server 2016 machine with faster disks. The previous target had some other services running, and it was on Windows Server 2008 R2. We'll look into it a little further to see whether it's the disks or 2008 R2 that has the bigger impact.
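For a comparison like that, one library-agnostic approach is to replay the same batches of keys on each machine and compare the latency distributions rather than a single average. The sketch below is only an illustration in Python: `time_requests` and `summarize` are hypothetical helpers, and the in-memory dict is a stand-in for the real store, sized to match the 5-byte keys and roughly 2 KB values mentioned at the top of the thread.

```python
import statistics
import time


def time_requests(lookup, requests):
    """Return wall-clock seconds taken by lookup(keys) for each request."""
    timings = []
    for keys in requests:
        start = time.perf_counter()
        lookup(keys)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Summarize latencies; a max far above the median hints at cold reads."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for the real store: an in-memory dict with
    # 5-byte keys and 2 KB values.
    store = {i.to_bytes(5, "big"): bytes(2048) for i in range(10_000)}

    def lookup(keys):
        return [store[k] for k in keys]

    requests = [[i.to_bytes(5, "big") for i in range(n)] for n in (200, 1000, 2000)]
    print(summarize(time_requests(lookup, requests)))
```

Running the same request list on both machines, once right after a restart and once again immediately afterwards, should show whether the slow cases cluster on first (cold) access, which would point towards the disks rather than the OS version.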