JuliaSymbolics / Symbolics.jl

Symbolic programming for the next generation of numerical software
https://symbolics.juliasymbolics.org/stable/
Other
1.33k stars 147 forks source link

Array variable different from array of variables #379

Closed moble closed 1 year ago

moble commented 2 years ago

I think I'm not alone in expecting that indexing one element of an array variable should produce something that behaves the same as an ordinary (non-array) variable. For example, if I create

@variables v1 v[1:2]

then I expect v1 and v[1] to behave the same. They don't, and this causes some errors.

The problem is that, while v1 and v[1] are both Nums, they're different on the inside. Evidently, Num is just a thin wrapper around the val field, and that field contains different things in these two cases:

julia> typeof(v1.val)
SymbolicUtils.Sym{Real, Base.ImmutableDict{DataType, Any}}

julia> typeof(v[1].val)
SymbolicUtils.Term{Real, Nothing}

Note that one is a Sym, and the other is a Term. I guess Real is being broadcast across the elements of v or something??? (I don't pretend to understand how Symbolics and friends work.)

This seems to be the origin of several recent bugs: #370, #371, #372, and #376. In each of those cases, if I replace expressions like v[1:N] with v=[Symbolics.variable(:v, i) for i in 1:N], the problems go away. I guess you could argue that all downstream functions should be changed to accept both Sym and Term from here on out, but I feel like it might be better to just change the behavior of @variables v[1:2].

┆Issue is synchronized with this Trello card by Unito

ChrisRackauckas commented 2 years ago

No, removing array variables is 100% and unequivocally the wrong direction. Yes, array variables are hard. Yes, they are incomplete. No, we should not give up.

Array variables allow for O(1) representations of things like matrix multiplication. If you want to represent modern algorithms like neural networks symbolically, you need them. But then you need vector calculus, and you need vector codegen, etc. Instead of shying away from the problem, we are saying that this is one of the big problems we need to tackle. It's part of our longer term scope.

Until then, yes the easiest way to make arrays easier is to just scalarize them. collect(x), etc. That of course goes back to the O(n) representation and makes expressions explode, but it'll get stuff done today.

moble commented 2 years ago

removing array variables is 100% and unequivocally the wrong direction

Sorry, I don't understand all the terminology, but I didn't think I was suggesting this. All I want is for v1 and v[1] to behave in the same ways — even while v itself is something else. Are you saying this is impossible?

ChrisRackauckas commented 2 years ago

They should behave in the same ways, but they should not be the same. v[1:10000000000] could either be a length 10000000000 array of symbols, or it could be one symbol which stands for a 10000000000 length array. A*v can be O(10000000000^2) operations expanded out, or it could be A*v where A is n x 10000000000 and v is 10000000000. See https://symbolics.juliasymbolics.org/dev/manual/arrays/ . The latter is what we are trying to do for clear performance reasons.

Your approach v=[Symbolics.variable(:v, i) for i in 1:N] is to eagerly expand this into the 10000000000 array of symbols. While making everything scalar operations is easier in the short term, in the long term we hope to make array variable handling good enough that people don't have to do that. v[1] is fundamentally a different object: it's not a symbol, it's an operation on an array symbol while v1 is a scalar symbol. It still needs more work in order for it to act more naturally and error less though.

moble commented 2 years ago

Your approach v=[Symbolics.variable(:v, i) for i in 1:N] is to eagerly expand this into the 10000000000 array of symbols.

To be clear, I was not at all suggesting that this is desirable; quite the opposite — hence the issue. I just used it as a proof of concept to demonstrate what was going wrong, and as a temporary workaround. I object to this being referred to as my approach. :)

It still needs more work in order for it to act more naturally and error less though.

So I guess I'm suggesting that there was a recent regression, because #370, #371, #372, and #376 were the only examples I could find where this was coming up.

ChrisRackauckas commented 2 years ago

Array variables were added in August in the breaking v1.0 release.

shashi commented 2 years ago

I think downstream code should as much as possible not use the types Sym and Term but use the interface functions such as istree, operation and arguments.

ChrisRackauckas commented 2 years ago

Agreed. I think one issue is nameof, but yes those kinds of issues should get fixed and the user shouldn't have to worry about the difference.

shashi commented 2 years ago

There's interface for that now as well. issym(x)==true implies nameof(x) works.

moble commented 1 year ago

Nowadays, the example I gave above returns

julia> typeof(v1.val)
SymbolicUtils.BasicSymbolic{Real}

julia> typeof(v[1].val)
SymbolicUtils.BasicSymbolic{Real}

and the issues I mentioned above seem to be fixed (or failing for different reasons).