Open fingon opened 6 years ago
Hi,
Thanks for your comments and questions, some answers:
Yep, I was talking about Python equivalent of e.g. del list[idx] or list[4:4] = [value].
Go data structures in general (and slices in particular) are bit too minimalistic for my liking, I have plenty of code that does multiple lines of boring things just to get something that is one line in Python.
MVPness of PEDS is quite cool. I looked at most of the other ds libraries recently and most of them did not do M or V.. :-p (e.g. 'sure, we have 20 data structures but some of them miss delete, and others have O(n) behavior for half operations. good luck,.')
Write path turned out to be prohibitively expensive for what I was doing with the library. At a guess only way this could really work if one used some sort of internal freelist pool, and allowed for slight inefficiency in memory usage. That sounds .. nasty. Sigh.
OK, that's a shame. Do you have any specific benchmarks that you can share? I haven't had time to produce any yet.
I have given some thought to memory management but not really come up with any viable optimizations:
I'm happy for more input on this! PR with benchmarks is also welcome! :-)
Hello,
[ ... ] Since the nodes may be shared some kind of reference counting would also have to be used which would then have to be protected by locks for thread safety.
Surely refcounting can just be done with atomic.AddUint32
, no? Something like this:
func (v *privatePersonBySsnItemBucketVector) Free() {
if atomic.AddUint32(&v.ref, ^uint32(0)) == 0 {
someGlobalPool.Put(v)
}
}
In the simplest case, someGlobalPool
could be implemented by a sync.Pool
. If we wanted to get fancy, we could implement it as a global PEDS set and a CAS loop. Thoughts?
Either way +1 for benchmarks.
Thanks for this nice little library!
Hi!
Sure, a atomic.AddUint32
or similar could probably be used for this! It would still put the burden on the user to free the objects though which is a shame. It may be OK though since it's more of an optimisation than something you would actually have to do.
I like your ideas around the pooling and would be happy taking a PR fleshing it out! (The same goes for benchmarks :-)).
Sure, a atomic.AddUint32 or similar could probably be used for this! It would still put the burden on the user to free the objects though which is a shame. It may be OK though since it's more of an optimisation than something you would actually have to do.
Good point! I always tend to forget that sync.Pool
isn't really manual memory management, and you can always offload the responsibility to the garbage collector.
Also, at the risk of being cheeky, PEDS itself is arguably an optimization! If users are reaching for this library, its already because they're trading convenience for performance. It doesn't seem unreasonable to push that tradeoff a bit further, especially if users don't have to use the Free
method.
I like your ideas around the pooling and would be happy taking a PR fleshing it out! (The same goes for benchmarks :-)).
I'm currently evaluating PEDS for a project of mine, but I'm not quite at the point where I'm using it. If/when that day comes, I'm not at all opposed to taking a stab at it, and I'll definitely be in touch for some guidance!
Thanks again for this great work!
First of all, great library. I think it is the first reasonable looking Golang data structure library that I have seen, as most of the stuff that drowns in locks or is functional is either really inefficient or awkward to use with multiple goroutines at once.
So, on to questions/requests:
Insert/Delete would be also nice to have in vector/vectorslice
why vector slice does not have ToNativeSlice? (it would be more efficient to take vector slice and then ToNativeSlice only parts you are interested about, as opposed to the whole vector)
any chance of perf benchmarks? the code looks sensible, but I am wondering how much of a hit I will get if I actually use this as opposed to dealing with locking costs + mutable datastructures
(I can guess that O(log32(n)) is heavily involved in some things that are O(1), but I wonder also about constants as well due to memory allocation etc.)