JuliaData / IndexedTables.jl

Flexible tables with ordered indices
https://juliadb.org
MIT License

rename pool back to compact_mem and implement here #235

Closed (piever closed this 5 years ago)

piever commented 5 years ago

In a discussion on making DataFrames more lightweight (ref: https://github.com/JuliaData/DataFrames.jl/issues/1764), it was pointed out that StructArrays has also accumulated some bloat, which is unnecessary for people who use it for things other than data analysis. I have a plan to get rid of the PooledArrays dependency and the use of Requires for WeakRefStrings, but it requires moving the "pooling" functionality back here. So I've renamed pool (which we took from StructArrays) back to its old name, compact_mem (pool is a misleading name anyway, since we only pool some columns), and moved the implementation here. In the future, once I get rid of those dependencies, I will deprecate the implementation in StructArrays.

KristofferC commented 5 years ago

What's the point of Requires for stuff like WeakRefStrings? It loads extremely fast anyway; I would bet the overhead from Requires is larger than that of just loading it unconditionally.

piever commented 5 years ago

I agree. I would like to remove the dependency completely by adding a two-line interface package (or two lines to Base Julia) that encompasses all these packages (PooledArrays, CategoricalArrays, WeakRefStrings), so that row comparison can be made efficient for custom array storage X without depending on package X (see https://github.com/JuliaLang/julia/issues/31606 for details). In the meantime, I agree with you that I may be better off depending on WeakRefStrings directly than putting it behind Requires.