jonniedie / ComponentArrays.jl

Arrays with arbitrarily nested named components.
MIT License
291 stars 35 forks source link

Symbolic indexing should slice, not view #90

Closed jonniedie closed 1 year ago

jonniedie commented 3 years ago

getproperty should continue to be a maybeview situation, but getindex should always create a copy, even when one or more of the indices are symbols. The way it is right now, getindex just follows getproperty when there are symbol indices and regular slicing when there are none. This is kinda weird and surprising. Views should be explicitly made with @view, view, or Base.maybeview.

bgroenks96 commented 3 years ago

Hey! This broke my code! :(

bgroenks96 commented 3 years ago

I'm not sure I agree that it's surprising... but I can see how maybe this is more consistent with Julia array behavior.

That being said, I would still argue that getindex with a symbol should behave exactly like getproperty. This is what I expected and how I wrote my code.

I think it's a matter of philosophy whether the ComponentArray should behave like a data structure or purely as an array.

jonniedie commented 3 years ago

Sorry this broke your code! I've been trying to keep breaking changes to a minimum, but the last few changes that involved indexing/viewing have been something that's needed to happen for a while for a few reasons.

First is consistency with the array interface. There is a lot of room to play in whether ComponentArrays should be more struct-like or array-like. While ComponentArrays are supposed to act more like structs for the user, it's important that they follow the array interface in order to work inside of solvers. For a while, ComponentArrays always viewed, even when using numerical indices. This ended up causing some bugs because solvers (and things that use arrays in general) expect getindex on <:AbstractArrays to always create copies. So I changed it such that purely numerical/colon indexing would slice, but symbolic indexing would view. But then what happens when you have a higher-dimensional ComponentArray like a ComponentMatrix and one of the indices is a number or colon and the other is a symbol? Should it slice or view? It is much more consistent to just say getindex always slices and view always views.

Which brings up the second issue: How do you slice with symbolic indices if that's what you want? With the old behavior, there was no way to slice with symbolic indices. You had to view and then create a copy of the view. With the new behavior, slicing and viewing is done explicitly and works no matter what type of indices you have.

Again, I apologize for this breaking your code. It's not a decision I made lightly (and it actually broke some of my code as well, because I was depending on that behavior in a few cases). Thankfully, mine only took a few @views sprinkled here and there. Hopefully yours will be similar.

bgroenks96 commented 3 years ago

I do symbolic indexing programmatically, i.e. I loop over variable names and then do getindex to write to the sub-array, e.g:

    # set up default mass matrix
    M_diag = similar(setup.uproto)
    for layer in keys(setup.meta)
        # setup.meta is a named tuple
        progvars = setup.meta[layer][:progvars]
        algvars = setup.meta[layer][:algvars]
        for var in progvars
            M_diag[layer][varname(var)] .= one(eltype(M_diag))
        end
        for var in algvars
            M_diag[layer][varname(var)] .= zero(eltype(M_diag))
        end
    end

By hand via getproperty this would be something like M_diag.soil.T .= 1.0, as an example. So the issue here for me is that I use getindex as the programmatic equivalent of getproperty... though I just realized that I could presumably also invoke getproperty, though that's a bit ugly.

jonniedie commented 3 years ago

Try:

    # set up default mass matrix
    M_diag = similar(setup.uproto)
    for layer in keys(setup.meta)
        # setup.meta is a named tuple
        progvars = setup.meta[layer][:progvars]
        algvars = setup.meta[layer][:algvars]
        diag_layer = @view M_diag[layer]
        for var in progvars
            diag_layer[varname(var)] .= one(eltype(M_diag))
        end
        for var in algvars
            diag_layer[varname(var)] .= zero(eltype(M_diag))
        end
    end
bgroenks96 commented 3 years ago

diag_layer would also be a CompnentArray so I would have to make that a view too, I guess.

jonniedie commented 3 years ago

You don't have to view that because setindex! always sets into the array instead of a copy of the array

jonniedie commented 3 years ago

Oh, btw, there's a function called valkeys that's exported that will get Val-wrapped keys from your ComponentArray so that indexing is much faster than keys that you get by using keys.

bgroenks96 commented 3 years ago

I feel like I tested this before and observed no difference in performance? But maybe I tested getproperty and not getindex... I don't remember.

bgroenks96 commented 3 years ago

Anyway, this particular code snippet is not performance critical so no worries about that :)

jonniedie commented 3 years ago

Yeah, if you tested with getproperty, there would be no difference in performance because getproperty is able to constant propagate but getindex for some reason isn't (even though the code is the exact same for both on my end). It's a long-standing issue for ComponentArrays.

bgroenks96 commented 3 years ago

Probably Julia has special handling for getproperty. Have you tried benchmarking with getproperty invoked directly, maybe also with a non-constant argument?

bgroenks96 commented 3 years ago

By the way, another side note, I actually don't know if it's really so critical that ComponentArray works inside of a solver. I currently just store the axis information and then invoke the direct constructor ComponentArray(array, axis) on each step, which seems to be more or less cost-free (though please correct me if I am wrong). This lets you use regular array types in the solver but ComponentArrays in your code which feels like a best of both worlds to me.

jonniedie commented 3 years ago

This is a good method to have around as a backup, but I think it would be unfortunate if that were the only way you could use ComponentArrays in a solver. There's a fair amount of boilerplate doing it that way and I think it just makes things less nice to work with in general. Plus non-solver things like state-space systems with named states (see #88) also require ComponentArrays to keep their Component-ness through array-like operations.

jonniedie commented 3 years ago

This came up again on Slack today, so I'm going to go ahead and re-open this up to get some more opinions on it. I want to make sure this is the best indexing behavior before cementing it in place for v1.0 (which should come within the next few months?).

bgroenks96 commented 3 years ago

After thinking about it more, I think I agree with this change. Even for struct-like behavior, getindex is not defined by default, and when it is, it typically signals array-like behavior. Since the default behavior of getindex is usually to slice (as much as I vehemently disagree with this decision by the Julia devs), it makes sense that getindex with a symbol should follow this convention.

It was annoying that my code broke, but I can't think of any good argument against this change.