SymbolicML / DynamicQuantities.jl

Efficient and type-stable physical quantities in Julia
https://symbolicml.org/DynamicQuantities.jl/dev/
Apache License 2.0
132 stars 17 forks source link

Make SparseArrays a weak dependency? #53

Closed devmotion closed 11 months ago

devmotion commented 12 months ago

In Julia >= 1.10, SparseArrays won't be included in the system image anymore. To reduce loading times, therefore SparseArrays was or is about to be demoted from a regular to a weak dependency for many packages in the ecosystem such as FillArrays, Distances and PDMats. However, if a package in the dependency stack loads SparseArrays all these extensions will be compiled and loaded. Could the SparseArrays dependency of DynamicQuantities also be moved to an extension or removed alltogether to avoid that it triggers all these extensions?

MilesCranmer commented 12 months ago

Thanks for raising this @devmotion. I'm not sure if there's a way around this because SparseArrays is actually used internally for the SymbolicDimensions struct: https://github.com/SymbolicML/DynamicQuantities.jl/blob/87e1eb3d0d4c139ed94a563e2fae3cababc5e512/src/symbolic_dimensions.jl#L37-L42

It is used here because there are ~100 physical symbolic dimensions (m and cm are separate dimensions), so rather than store 100 fields, one for each dimension, I store it as a single sparse vector.

Perhaps it is possible to write a small internal sparse vector library to get this working without loading all of SparseArray.

What do you think?

devmotion commented 12 months ago

I wonder if it would be possible to replace _data with a dictionary or, more similar to the sparse array, with two vectors ind::Vector{Int} (or even sym::Vector{Symbol}?) and val::Vector{R} of indices of non-zero symbolic dimensions and their values.

MilesCranmer commented 12 months ago

Definitely possible. But it would add to the maintenance/testing burden to need to unit test a custom sparse array package, and possibly introduce new bugs. (Whereas SparseArrays works out-of-the-box)

devmotion commented 12 months ago

Maybe my suggestion was not completely clear: I don't think one should implement a new sparse array type but possibly a different data structure (array or two vectors) could provide the same functionality without significant regressions. And since SparseVector is implemented based on two vectors, I think the alternative with two vectors would be a natural candidate (https://github.com/JuliaSparse/SparseArrays.jl/blob/034d23420bba5a8688fba8d857fc7f813f121f31/src/sparsevector.jl#L38-L39).