haskell / containers

Assorted concrete container types
https://hackage.haskell.org/package/containers
318 stars 178 forks source link

IntSet range construction #632

Open treeowl opened 5 years ago

treeowl commented 5 years ago

fromDistinctAscList [a,b..c] takes time proportional to the number of elements. We should be able to write a specialized function that does better when b-a is less than 58 or so, and significantly better when it is very small. The idea is to fill up a whole Tip in one go.

mikeplus64 commented 5 years ago

It might be nice to just have a special enumFromThenTo for IntSet. fromDistinctAscList [a, b .. c] can then use a rewrite rule to use that.

treeowl commented 5 years ago

@mikeplus64 , such a rewrite rule would probably be impossible to write. enumFromThenTo gets rewritten to use build, and I don't think the things it's rewritten to are exported. In any case, it's a godawful mess and we should probably steer clear. But yes, a specialized version of some sort seems quite reasonable.

mikeplus64 commented 5 years ago

Ah, that's a shame. But yeah enumFromTo and enumFromThenTo would still be nice and probably much simpler (i.e. more performant) due to knowing upfront the whole range.

treeowl commented 5 years ago

Indeed, it's likely somewhat helpful to know the final tree shape in advance.

amigalemming commented 5 years ago

I like to have that function in order to initialize a range as start set for a prime sieve.

meooow25 commented 1 year ago

This should be pretty straightforward to implement. I can put together a PR.

meooow25 commented 1 year ago

Somehow I managed to overlook that that original comment is about [a,b..c], thinking that it's [a..b] instead. [a,b..c] won't be that straightforward. Anyway, no harm in adding the [a..b] version I think.

jwaldmann commented 1 year ago

I like to have that function in order to initialize a range as start set for a prime sieve.

Really - und what do you need that prime sieve for ... Besides education and entertainment - which are totally valid of course. I am thinking - if the (candidate) primes are small, then you don't need any fancy data structures, and if they're large, then they will be sparse, and IntSet block structure won't help much? (average gap between $p$ and next prime is $1/\log p$)

But anyway, could you extract benchmarking code from your application?