isi-vista / immutablecollections

A library for immutable collections, in the spirit of Guava's Immutable Collections.
MIT License
3 stars 1 forks source link

Immutable collections are extremely slow #3

Open gabbard opened 5 years ago

gabbard commented 5 years ago

Issue by rgabbard Friday Jun 01, 2018 at 19:13 GMT Originally opened as https://github.com/isi-nlp/isi-flexnlp/issues/349


Timings for immutable set creation:

empty frozen: base 0.0005451000106404535 ms vs probe 0.0016776999473222531 ms 
singleton frozen: base 0.000693500078341458 ms vs probe 0.0019225999494665302 ms 
non singleton frozen: base 0.0007623999408679083 ms vs probe 0.0021223000658210367 ms 
big set frozen: base 1.3013469000725308 ms vs probe 1.9578501000069082 ms 

empty immutable: base 0.0006041000233381055 ms vs probe 0.21783060001325794 ms 
singleton immutable: base 0.000735200046619866 ms vs probe 0.42865529994742246 ms 
non singleton immutable: base 0.0008420000085607171 ms vs probe 0.44097009995311964 ms 
big set immutable: base 1.2643873998968047 ms vs probe 100.92103900005895 ms 

vs” numbers are “time for creation of the list used for initialization” vs “time for creation of the set”. top numbers are frozenset bottom numbers are immutableset We are 50-200 times slower :-(

(this is for just object creation)

gabbard commented 5 years ago

Comment by rgabbard Friday Jun 01, 2018 at 19:15 GMT


A native implementation may be in order. I wonder if we could just steal the frozenset implementation from https://github.com/python/cpython/blob/491bbedc209fea314a04cb3015da68fb0aa63238/Objects/setobject.c and remove all mutating methods

@ConstantineLignos

gabbard commented 5 years ago

Comment by ConstantineLignos Friday Jun 01, 2018 at 19:31 GMT


We certainly could go the C route. I'm less worried about the implementation time than the time required to sort out distribution for all platforms (which will possibly have to kept up for every new minor Python version). It may be more maintainable to do it in Cython. I have to run something for a while later tonight. If I still have energy left at that point, I might try do see how far I can get in doing a version of ImmutableSet in Cython and seeing how much faster it is.

gabbard commented 5 years ago

Comment by ConstantineLignos Friday Jun 01, 2018 at 19:35 GMT


Digging in a little bit more, there probably won't be a difference between C and Cython for distribution; we're going to have to get into the business of building wheels either way. There are some solutions out there that can automatically build wheels for all Python platforms. Cython is less of a commitment (and is much more readable than a pure C extension) so I'd still like to give it a try first.

gabbard commented 5 years ago

Comment by rgabbard Friday Jun 01, 2018 at 19:47 GMT


Trying Cython first sounds fine to me

gabbard commented 5 years ago

Pursuing this seriously will require benchmarks: #17 #16 #15 #14