JuliaSparse / SparseArrays.jl

SparseArrays.jl is a Julia stdlib
https://sparsearrays.juliasparse.org/

[Proposal] Generalize SparseVector to allow non-stored value other than zero. #72

Open jlapeyre opened 3 years ago

jlapeyre commented 3 years ago

SparseVector is designed assuming that all values that are not stored explicitly are equal to zero. I have a use case in which the non-stored value should be one(::MyType), rather than zero(::MyType).

You can get this by modifying SparseVector with

# Fallback definitions reproducing the current behavior: the non-stored value is zero.
neutral(x) = zero(x)
isneutral(x) = iszero(x)
neutrals(::Type{T}, args...) where T = zeros(T, args...)

and replacing most occurrences of zero, iszero, and zeros (the latter used when converting to a dense vector) in sparsevector.jl with neutral, isneutral, and neutrals. This should leave all current code using SparseVector unchanged, except that there appears to be a small performance hit; I'm not sure why, but it is probably not essential and can be fixed.

Then, I can define neutral(x::MyType) = one(x), etc.
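
For concreteness, here is a minimal sketch of what that opt-in might look like. MyType, its field, and the choice that MyType(1.0) is its multiplicative identity are all illustrative, not part of the proposal:

# Generic fallbacks from the snippet above, repeated so this sketch stands alone.
neutral(x) = zero(x)
isneutral(x) = iszero(x)
neutrals(::Type{T}, args...) where T = zeros(T, args...)

# A hypothetical element type whose non-stored value should be one, not zero.
struct MyType
    x::Float64
end
Base.one(::Type{MyType}) = MyType(1.0)
Base.one(m::MyType) = one(MyType)
Base.isone(m::MyType) = m.x == 1.0

# Opt in: for MyType, the non-stored value is one.
neutral(m::MyType) = one(m)
isneutral(m::MyType) = isone(m)
neutrals(::Type{MyType}, args...) = fill(one(MyType), args...)

A generalized SparseVector would then call neutral, isneutral, and neutrals at the call sites that currently use zero, iszero, and zeros.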

I have started doing this by copying code from sparsevector.jl, renaming SparseVector, and making the edits. But, this seems wasteful.

Of course, you may need to define more methods for your use case. The main difference is in multiplying an element by the neutral element: a large amount of the linear algebra code will not give correct results. But, in my case, I don't need most of the linear algebra stuff.

KristofferC commented 3 years ago

I think it is quite fundamental to a lot of the sparse code that this "neutral" element acts like a zero in the sense of neutral(x) + x = x and neutral(x) * x = neutral(x).
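
Concretely, here is a schematic sparse-times-dense dot product, not the actual SparseArrays.jl kernel, showing where those identities are used:

# x is sparse with stored indices nzind and stored values nzval; y is dense.
# Only stored entries are visited. Each skipped index i implicitly contributes
# neutral * y[i] to the sum, so skipping is valid only if that product is the
# additive identity, i.e. only if the neutral element behaves like zero.
function schematic_dot(nzind::Vector{Int}, nzval::Vector{T}, y::Vector{T}) where T
    acc = zero(T)
    for (i, v) in zip(nzind, nzval)
        acc += v * y[i]
    end
    return acc
end

With a neutral element of one, every skipped index would have to contribute y[i], so kernels like this one would need to be rewritten rather than merely re-parameterized.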

But, in my case, I don't need most of the linear algebra stuff.

In this case, couldn't you just use a Dict with all the non-neutral elements?

jlapeyre commented 3 years ago

I think it is quite fundamental to a lot of the sparse code that this "neutral" element acts like a zero

Yes, this is important. This is one reason I delayed opening this issue. The other is that I guess my use case is not very common. On the other hand, there is a significant amount of code in sparsevector.jl that does work independently of the properties of zero.
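
Indexing is one example of such code. Schematically (this is not the stdlib implementation), the only zero-specific piece is the fall-through return value, and nothing depends on how that value behaves under + and *:

# Return the stored value if index i is stored, otherwise the non-stored value.
# nzind is assumed sorted, as in SparseVector.
function schematic_getindex(nzind::Vector{Int}, nzval::Vector{T}, i::Int) where T
    k = searchsortedfirst(nzind, i)
    if k <= length(nzind) && nzind[k] == i
        return nzval[k]
    else
        return zero(T)   # the only zero-specific line; the proposal would return the neutral value here
    end
end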

In this case, couldn't you just use a Dict

It's convenient to use the array interface. In fact, I have parametric types that can take either a dense or a sparse vector-like object as a field. Vector works immediately, but I need a modified SparseVector. I suppose I could wrap a Dict in a subtype of AbstractVector. I don't have a clear idea of the sizes of my objects, exactly how they will be used, or what that implies for the performance of the sparse-vector vs. Dict approaches. But a big difference is that SparseVector embodies design and optimization choices that I would have to repeat and debug.
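
For reference, a minimal sketch of the Dict-wrapping idea; the type name and the fixed choice of one as the non-stored value are illustrative:

# Hypothetical Dict-backed vector whose non-stored entries are one(T).
struct DictBackedVector{T} <: AbstractVector{T}
    data::Dict{Int,T}   # stored (non-neutral) entries
    n::Int              # length of the vector
end

Base.size(v::DictBackedVector) = (v.n,)
Base.getindex(v::DictBackedVector{T}, i::Int) where T = get(v.data, i, one(T))
function Base.setindex!(v::DictBackedVector{T}, x, i::Int) where T
    isone(x) ? delete!(v.data, i) : (v.data[i] = x)
    return v
end

For example, DictBackedVector(Dict{Int,Float64}(), 5) behaves like a length-5 vector of ones until entries are set, but it inherits none of the design and optimization work in SparseVector.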

More concretely, I have been working with this

struct OpTerm{W<:AbstractOp, T<:AbstractVector{W}, CoeffT}
    ops::T        # one single-mode operator per degree of freedom (mode)
    coeff::CoeffT # scalar coefficient of the term
end

where AbstractOp represents an operator (say, a linear operator) acting on a single degree of freedom (or mode), and the field ops is a vector of such operators. In particular, AbstractOp might be

struct FermiOp <: AbstractFermiOp
    ind::Int
end

or

struct Pauli <: AbstractPauli{Complex{Int}}
    hi::Bool
    lo::Bool
end

For some applications, it turns out that ops is dense. For others, ops is quite sparse, with most entries being the identity operator. (E.g. electronic Hamiltonians may have many modes, but non-identity operators on at most four modes per term.)
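
Here is a self-contained illustration of that sparsity pattern; the abstract-type hierarchy, the assumption that Pauli(false, false) encodes the identity, and the indices and coefficient below are made up for the example, not taken from the actual package:

# Assumed hierarchy: single-mode operators, with Pauli as one concrete encoding.
abstract type AbstractOp end
abstract type AbstractPauli{T} <: AbstractOp end

struct Pauli <: AbstractPauli{Complex{Int}}
    hi::Bool
    lo::Bool
end

struct OpTerm{W<:AbstractOp, T<:AbstractVector{W}, CoeffT}
    ops::T
    coeff::CoeffT
end

# Assume the all-false encoding is the identity operator.
Base.one(::Type{Pauli}) = Pauli(false, false)

# A 50-mode term with non-identity operators on only two modes: stored densely,
# 48 of the 50 entries are the identity. A sparse vector whose non-stored value
# is one(Pauli) would store only the two non-identity entries.
ops = fill(one(Pauli), 50)
ops[3] = Pauli(false, true)   # some non-identity single-mode operator
ops[17] = Pauli(true, true)   # another one
term = OpTerm(ops, 0.5 + 0.0im)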

I'm not 100% sure that using sparse vectors here is the right approach, nor, if it is, how much functionality I need. I opened this issue as an RFC, e.g. to see whether others have had similar ideas. It's probably best to begin by copying the sparse vector code to a new repo and working on it there.

jlapeyre commented 3 years ago

Here is a quick implementation that is enough to support my current use case.

https://github.com/jlapeyre/SparseArraysN.jl