Glimpse / Glimpse.Prototype

Glimpse v2 prototype

In-memory storage data structures - fixes #55 #62

Closed mike-kaufman closed 8 years ago

mike-kaufman commented 8 years ago

Take a look at the changes to InMemoryStorage and let me know what you think. I optimized for efficient runtime complexity rather than for minimizing locking.

The commit messages don't look correct. I'm going to see if I can fix those.

nikmd23 commented 8 years ago

Thanks for taking this on @mike-kaufman!

I'm happy you wrote some tests around it. Our test coverage is basically non-existent at this point, but this is something important to make sure we've got right.

Did you do any testing on the throughput of this? Any idea how many messages per second can be persisted? Is it safe to assume query time has stayed basically the same?

mike-kaufman commented 8 years ago

@nikmd23 - I didn't do any perf testing. Algorithmically, it should be constant time. The only concern here is lock contention. The previous implementation was using ConcurrentDictionary, and only took the "global" lock if it needed to remove entries. However, it was algorithmically slower than my changes (O(n) to find the oldest request), and it wasn't even finding the oldest request correctly.

I do think that if lock contention proves to be a concern down the road, we can take an approach similar to ConcurrentDictionary's: namely, use a set of locks and choose one based on the request ID's hash code (e.g., lock(_locks[requestId.GetHashCode() % _locks.Length])). We'd still need a global lock for updating the linked list, but the window where it's held would be small.
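
To illustrate the idea, here's a minimal sketch of lock striping (hypothetical names like StripedLockPool; this is not the code in this PR):

```csharp
// Hypothetical sketch of lock striping, not the code in this PR.
// Each request ID maps to one of a fixed pool of locks, so writers
// for different requests rarely contend on the same monitor.
public class StripedLockPool
{
    private readonly object[] _locks;

    public StripedLockPool(int stripeCount = 16)
    {
        _locks = new object[stripeCount];
        for (var i = 0; i < stripeCount; i++)
        {
            _locks[i] = new object();
        }
    }

    public object GetLockFor(string requestId)
    {
        // Mask the hash to a non-negative index into the pool.
        var index = (requestId.GetHashCode() & 0x7FFFFFFF) % _locks.Length;
        return _locks[index];
    }
}

// Usage: lock (pool.GetLockFor(requestId)) { /* per-request update */ }
```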

Anyway, my preference here would be to keep it simple & keep it algorithmically fast, and then optimize for any lock contention issues if & when they arise.

Note that one piece I left out, which should probably be done here for true thread safety, is that the read/query methods should lock as well.

nikmd23 commented 8 years ago

Sounds like a good idea @mike-kaufman.

We've got some load testing time in the backlog, so if there are any perf issues, we can cross that bridge then.

mike-kaufman commented 8 years ago

@nikmd23, re: your question on timing, one of the tests I added runs 25 concurrent threads, each inserting 80 "messages" spread across 40 requests. So that's 25 threads inserting a total of 2000 messages simultaneously. This completes in under half a second on my machine, in Visual Studio, with test overhead included.
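
The shape of that test is roughly the following (a sketch only; InMemoryStorage, Persist, and Message here are stand-ins, not the PR's exact API):

```csharp
// Sketch of the concurrency test shape described above. InMemoryStorage,
// Persist, and Message are hypothetical stand-ins for the PR's actual API.
using System.Collections.Generic;
using System.Threading;

const int ThreadCount = 25;
const int MessagesPerThread = 80;
const int RequestCount = 40;

var storage = new InMemoryStorage();
var threads = new List<Thread>();

for (var t = 0; t < ThreadCount; t++)
{
    threads.Add(new Thread(() =>
    {
        for (var i = 0; i < MessagesPerThread; i++)
        {
            // Spread each thread's 80 messages across 40 request IDs.
            var requestId = "request-" + (i % RequestCount);
            storage.Persist(new Message(requestId));
        }
    }));
}

threads.ForEach(thread => thread.Start());
threads.ForEach(thread => thread.Join()); // all 25 x 80 = 2000 inserts done
```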

avanderhoorn commented 8 years ago

Looking great! Thanks so much for getting this rebased on the latest changes... the world changed on you :S Great work! As a side note, what does LRU mean?

mike-kaufman commented 8 years ago

LRU is "least recently used". The list lets us maintain and find the least-recently-used request in constant time. I can rename this if it's not clear; let me know.
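
For illustration, the standard pattern looks roughly like this (hypothetical names, not the PR's exact types): a dictionary gives O(1) lookup of a request's list node, and the linked list keeps requests ordered by recency, so the eviction victim is always the first node.

```csharp
// Sketch of the LRU bookkeeping pattern (hypothetical names, not the
// PR's exact types): a dictionary for O(1) lookup by request ID plus
// a linked list ordered by recency of use.
using System.Collections.Generic;

public class LruIndex
{
    private readonly Dictionary<string, LinkedListNode<string>> _nodesByRequestId
        = new Dictionary<string, LinkedListNode<string>>();
    private readonly LinkedList<string> _byRecency = new LinkedList<string>();

    public void Touch(string requestId)
    {
        LinkedListNode<string> node;
        if (_nodesByRequestId.TryGetValue(requestId, out node))
        {
            // Move an existing request to the most-recent end: O(1).
            _byRecency.Remove(node);
            _byRecency.AddLast(node);
        }
        else
        {
            _nodesByRequestId[requestId] = _byRecency.AddLast(requestId);
        }
    }

    public string EvictLeastRecentlyUsed()
    {
        var oldest = _byRecency.First; // least recently used: O(1)
        if (oldest == null) return null;
        _byRecency.RemoveFirst();
        _nodesByRequestId.Remove(oldest.Value);
        return oldest.Value;
    }
}
```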

mike-kaufman commented 8 years ago

Also, I can squash the changes in my branch if that simplifies things? It's not exactly clear to me whether this should be done during the pull, or if I should do it on my branch, or if it doesn't matter. Let me know.

avanderhoorn commented 8 years ago

Best to do it on your branch... that means when we pull it in, it goes in as one feature/commit. Either way, it's not a major issue.
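
For reference, the usual way to do that on the branch (assuming it was branched from master) is an interactive rebase:

```sh
# On the feature branch, squash all commits since master into one,
# then update the already-pushed branch.
git rebase -i master          # mark every commit after the first as "squash"
git push --force-with-lease   # safely overwrite the remote branch
```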

mike-kaufman commented 8 years ago

OK, squashed into one commit. Let me know if there is more feedback.

nikmd23 commented 8 years ago

Thanks Mike!