project-iris / iris

Decentralized cloud messaging
iris.karalabe.com
Other
571 stars 32 forks source link

Queue implementation is O(n^2) #62

Open eapache opened 9 years ago

eapache commented 9 years ago

I was reading https://github.com/project-iris/iris/blob/master/container/queue/queue.go out of curiosity and noticed that technically its behaviour is quadratic in the number of elements, instead of the (presumably) intended linear.

This is ameliorated by the default blockSize being a large constant factor, and the circular behaviour, but it's still true that inserting a large number of elements would result in copying a quadratic number of elements, which is not good for performance or scalability.

Growing the block list by a factor instead of single element when necessary would generally be more efficient.

karalabe commented 9 years ago

Hmmm, although you are right in that growing is effectively O(n^2), in reality it's n^2 / 2^24 (as in (n/2^12)^2), which should imho be small enough that you shouldn't care much. I would say that if you are having datasets that reach those enormous sizes in the number of elements, than the GC has already killed your program in runtime :D

The reason I didn't use growing by a factor is not to waste space as I thought this is good enough. If however you do have a scenario where it's not, I'd be happy to reconsider :)

Btw, the queue is originally from my cookiejar toolbox, you can find a few more datasets there.

eapache commented 9 years ago

in reality it's n^2 / 2^24 (as in (n/2^12)^2), which should imho be small enough that you shouldn't care much

That's what I was referring to when I said: "This is ameliorated by the default blockSize being a large constant factor".

if you are having datasets that reach those enormous sizes in the number of elements, than the GC has already killed your program in runtime :D

probably true :)

The reason I didn't use growing by a factor is not to waste space as I thought this is good enough. If however you do have a scenario where it's not, I'd be happy to reconsider :)

It was honestly the wasted space that made me pay attention. For queues < 2048 elements in size, your approach actually wastes more space than growing by a simple factor of two.

I only started poking around after implementing https://github.com/eapache/queue/ for another project (which does the simple double-when-full thing) and wondering if there were other implementations I could use instead.