Tensor: abstract tensor calculations (no components): calculating derivative wrt symbol within tensor expression does not work

joha2 commented 4 years ago

Hey guys,

First of all: Sympy is a great Python package! Keep up the good work!

I recently realized that the tensor module can be used for performing abstract tensor calculations without relying on any specific component representations. I played around a bit and tried to perform a derivative of a tensor expression by a specific tensor utilizing directional derivatives (sample code below). The directional derivative relies on inserting a small perturbation into the expression and calculate the derivative by a real variable, but this leads to an error. Since I am not a sympy power user, probably I do not understand how to perform the task properly. But maybe it is a bug. Please find my sample code below:

import sympy
from sympy.tensor.tensor import (TensorIndexType, 
                                 tensor_indices, 
                 tensorhead)
from sympy import Wild

# wild cards

w0 = Wild('w0')

# expansion parameter

tau = sympy.Symbol('tau')

Euclidean = TensorIndexType('Euclidean', dummy_fmt='E')
g = Euclidean.metric
a = tensorhead('a', [Euclidean])
b = tensorhead('b', [Euclidean])
i, j, k = tensor_indices('i j k', Euclidean)
db = tensorhead('{{\\Delta}b}', [Euclidean])

print(a(i)*b(-i))
# a(E_0)*b(-E_0)

# Calculate tensor derivative w.r.t b(-i) by using a directional derivative
# https://en.wikipedia.org/wiki/Tensor_derivative_(continuum_mechanics)
# 1. expand b(-i) by b(-i) -> b(-i) + tau*db(-i)

print((a(i)*b(-i)).replace(b(-w0), b(-w0) + tau*db(-w0)))
# a(E_0)*(b(-E_0) + tau*{{\Delta}b}(-E_0))

print(((a(i)*b(-i)).replace(b(-w0), b(-w0) + tau*db(-w0))).expand())
# a(E_0)*b(-E_0) + tau*a(E_0)*{{\Delta}b}(-E_0)

# 2. calculate derivative w.r.t expansion variable tau -> BOOM

((a(i)*b(-i)).replace(b(-w0), b(-w0) + tau*db(-w0))).expand().diff(tau)
# Traceback (most recent call last):
#  File "<stdin>", line 1, in <module>
#  File "/home/joha2/anaconda3/envs/sage/lib/python3.7/site-packages/sympy/core/expr.py", line 3095, in diff
#    return Derivative(self, *symbols, **assumptions)
#  File "/home/joha2/anaconda3/envs/sage/lib/python3.7/site-packages/sympy/core/function.py", line 1198, in __new__
#    it cannot be differentiated.''' % expr))
#ValueError: 
#Since there are no variables in the expression a(E_0)*b(-E_0) +
#tau*a(E_0)*{{\Delta}b}(-E_0), it cannot be differentiated.

# 3. would be: set `tau = 0` and isolate coefficient of db(-i) which 
# gives the tensor derivative of expression by b(-i)
# Should also work for more general expressions 
# and after canonicalization and contraction also for 
# expressions with more complicated monoterm symmetries

Also I realized that some of the code examples in the doc strings of the tensor module lead to errors. I will file an issue for that later.

Best regards Johannes

PS: sympy.__version__ shows 1.4

jksuom commented 4 years ago

There is no error on current master (and v.1.5):

>>> ((a(i)*b(-i)).replace(b(-w0), b(-w0) + tau*db(-w0))).expand().diff(tau)
Derivative(a(E_0)*b(-E_0) + tau*a(E_0)*{{\Delta}b}(-E_0), tau)

However, the result is unevaluated: tensor expressions have no _eval_derivative.

joha2 commented 4 years ago

@jksuom thanks for your immediate answer! So, I'll just have to update to 1.5? Anyway, at the end of the day, I need an evaluated expression where the tau part is differentiated. How can this achieved, best? Is there anything which can be contributed to sympy in this sense? The second question comes from the future step 3.: How to obtain the tensor coefficient of db(-E_0) afterwards? Is there a best-practise way for this?

Best wishes Johannes

jksuom commented 4 years ago

The proper fix would be adding the missing _eval_derivative methods to tensor module.

joha2 commented 4 years ago

Is there any documentation on this? In particular I am interested in the properties the _eval_derivative method is supposed to have.

Best wishes Johannes

jksuom commented 4 years ago

I would look into the implementations of Add._eval_derivative and Mul._eval_derivative. For TensMul it is probably necessary to preserve the order of the arguments.

joha2 commented 4 years ago

Hey @jksuom thanks for the hints. I've looked into the matter and came up with a derived class from TensMul and TensAdd:

class MyTensMul(TensMul):

    def _eval_derivative(self, s):
        non_tensor_parts = []
        tensor_parts = []
        for myfactor in self.args:
            if not isinstance(myfactor, Tensor):
                non_tensor_parts.append(myfactor)
            else:
                tensor_parts.append(myfactor)
        return Mul(*non_tensor_parts).diff(s)*TensMul(*tensor_parts)

class MyTensAdd(TensAdd):

    def _eval_derivative(self, s):

        return TensAdd(*[MyTensMul(*term.args).diff(s)
                         for term in self.args])

myexpr = ((a(i)*a(j)*b(-i)*b(-j)).replace(b(-w0), b(-w0) + tau*db(-w0))).expand()
myexpr = MyTensAdd(myexpr)
diffmyexpr = myexpr.diff(tau)
print(diffmyexpr)

First, I know that these methods have to be implemented into TensMul and TensAdd, but is the implementation above the right way to go? Its limitation is that it only cares about the non-tensor parts of the expressions and I think it is not acting properly on non-expanded tensor expressions, but this is the first step for a derivative by tensor objects.

Best wishes Johannes

jksuom commented 4 years ago

It looks like the MyTensAdd version assumes that all arguments are of type MyTensMul. Would it be possible to use the same code as Add._eval_derivative: return self.func(*[a.diff(s) for a in self.args]) that does not make any assumptions about the types?

joha2 commented 4 years ago

Good point! Although, I am not sure what the allowed sub types of TensAdd are. It could be obviously Tensor, TensMul, another TensAdd (in case it is not flattened). For the right index signature (i.e. scalar) it could also be a symbol, correct? What does self.func do?

jksuom commented 4 years ago

What does self.func do?

It is usually the class of self, in this case MyTensAdd (or a subclass that inherits this method if one has been defined). See https://docs.sympy.org/latest/tutorial/manipulation.html#func.

joha2 commented 4 years ago

Thanks for the clarification! I will have a look. Is there a more detailed documentation for the tensor module than the doc tests alone?

jksuom commented 4 years ago

I don't think that there is anything else than the collected docstrings: https://docs.sympy.org/latest/modules/tensor/index.html.

joha2 commented 4 years ago

Hey @jksuom I forked now from sympy master and created my own branch 180148_tensor_derivatives and implemented _eval_derivatives for TensAdd, TensMul and Tensor. Further I added _diff_wrt = True and is_scalar = True to Tensor. Anyways, I've got a few questions:

Is it good practise to perform a doit() on TensAdd or TensMul created in an intermediate expression?
Can the is_scalar = True lead to any inconsistencies? From my point of view it does not look like that, but the documentation only states that diff(mysymbol(), mysymbol()) = 1 in this case. A tensor without any indices is not used in any sense and the diff(a(i), a(j)) and diff(a(i), a(i)) leads to the correct results. (Also longer expressions lead to correct results.)
As far as I see, I have not changed the order of factors in the tensor expressions, but I do not oversee, whether it is a problem to have anticommutative tensors. Is there anything special to be considered?
Is there any documentation on how to create doctests in a best practise way?

I will provide some code snippets for a slight review later, if this is ok and appropriate.

Best wishes Johannes

Edit: some typos and clarifications Edit: -i to i in diff(a(i), a(i)) since this is the correct index structure: contravariant indices in tensor derivatives become covariant and vice versa

jksuom commented 4 years ago

I don't know the tensor module well enough to answer these questions. Maybe @Upabjojr could do that.

joha2 commented 4 years ago

@jksuom thanks for help anyway. The code snippets are given by:

class TensAdd(...):
...
    def _eval_derivative(self, s):
        # Evaluation like Add
        return self.func(*[a.diff(s) for a in self.args])

def TensMul(...):
...
    def _eval_derivative(self, s):
        # Evaluation like Mul
        args = list(self.args)
        terms = []
        for i in range(len(args)):
            d = args[i].diff(s)
            if d:
                terms.append(TensMul(*(args[:i] + [d] + args[i + 1:])))
        return TensAdd.fromiter(terms)

class Tensor(...):
...
    def _eval_derivative(self, s):
        (selfhead, selffreeindices) = self.args
        (otherhead, otherfreeindices) = s.args
        if selfhead is otherhead:
            mykroneckerdeltas = []
            for (iself, iother) in zip(selffreeindices, otherfreeindices):
                if iself.tensor_index_type is not iother.tensor_index_type:
                    raise ValueError("index types not compatible")
                else:
                    mytensorindextype = iself.tensor_index_type
                    mytensormetric = mytensorindextype.metric
                    kroneckerdelta = None
                    if iself.is_up == iother.is_up:
                        kroneckerdelta = mytensorindextype.delta(iself, -iother)
                    else:
                        dummy = TensorIndex('dummy', mytensorindextype,
                                            is_up=iself.is_up)
                        kroneckerdelta = mytensormetric(iself, dummy)\
                                         * mytensorindextype.delta(-dummy,
                                                                   -iother)
                    mykroneckerdeltas.append(kroneckerdelta)
            return TensMul.fromiter(mykroneckerdeltas)
        else:
            return S.Zero

There are still some cases which are not considered, yet. Further, maybe it would be useful to introduce a Scalar head like in xTensor for arbitrary functions which depend on a scalar combination of tensors, but this is another topic, and should maybe handled in another feature branch.

Best wishes Johannes

Upabjojr commented 4 years ago

The reason why tensor derivatives are not implemented is that partial derivatives are generally coordinate system-dependent. The tensor module sympy.tensor.tensor was conceived to have coordinate system independent expressions, that is, the expressions should be the same in any coordinate system.

It makes sense to extend the tensor module to also operate on specific coordinate systems, but that makes the design a bit more complicated. I haven't been able to come up with a clear and easy way to represent both coordinate system -independent and -dependent symbols. It's an API design issue.

_eval_derivative generalizes to the partial derivative operator in the tensor module. There are other kinds of derivatives, such as covariant derivatives, which should also be supported.

joha2 commented 4 years ago

Dear @Upabjojr,

Thanks for the explanation. The derivatives which I implemented in #18093 are no partial derivatives with respect to coordinates which introduce a coordinate dependency as you mentioned. I just implemented formal derivatives by tensor components and derivatives by a scalar symbol. As far as I understood, both leads still to coordinate independent expressions since the index structure stays correct and there are no chainrule correction as Christoffel symbols to be considered. (Please verify.)

Further, I did not understand _eval_derivative as pure partial derivative. Shouldn't it be possible to differentiate by arbitrary quantities in a tensor expression?

You're right, partial derivatives are another type of problem (where one probably has to introduce dependencies into Tensor or the notion of a tensor field/density together with a connection is necessary) and together with the consideration of arbitrary scalar functions (where maybe a new head for encapsulation is needed), this is still on my wishlist for derivatives in the tensor module.

The question is now: How to proceed with my PR? Is it still appropriate (i.e. does not break your API) or should I discard it or is there a third alternative?

Thanks for your help and best wishes Johannes

Upabjojr commented 4 years ago

The question is now: How to proceed with my PR?

Your PR looks good, maybe just a few edits and we can merge it. Let's keep discussing there.

joha2 commented 4 years ago

OK. So what's to be done, that it can be merged? Since I am not very experienced with the sympy/tensor API, how should I proceed to take your points into consideration?

Should the TensExpr have a diff_partial method to be introduced, which calls _eval_partial_derivatives?
Further, should my _eval_derivatives be renamed to _eval_partial_derivatives?
And since the partial derivatives are still on my wishlist in an abstract manner: How to incorporate this into your proposed class?
Since this introduces another path in user experience (diff_partial vs. diff), we should also update the documentation/doctests
diff should throw an error and explain the situation (as you suggested): What error type is suited best? NotImplemented? Further the error should refer to diff_partial.
Are there any other points to be considered?

Best regards Johannes

Edit: Grammar, clarifications, some additional points to be discussed

Upabjojr commented 4 years ago

Sorry for the delay in my reply, I was on Christmas holidays.

OK. So what's to be done, that it can be merged? Since I am not very experienced with the sympy/tensor API, how should I proceed to take your points into consideration?

Should the TensExpr have a diff_partial method to be introduced, which calls _eval_partial_derivatives?

Supporting partial derivatives is desirable, I'm not sure whether to add it as an method or as an external operator.

Further, should my _eval_derivatives be renamed to _eval_partial_derivatives?

It should definitely be renamed to something else, _eval_derivative is not uniquely defined for tensors (therefore users should call the specific method).

And since the partial derivatives are still on my wishlist in an abstract manner: How to incorporate this into your proposed class?

I would say

class TensPartialDerivative(TensExpr):
    def __new__(cls, expr, v):
        obj = TensExpr.__new__(cls, expr, v)
        obj._expr = expr
        obj._v = v
        return obj

    @property 
    def expr(self):
        return self._expr

    @property 
    def v(self):
        return self._v

    def doit(self, **kwargs):
        expr, v = self.expr, self.v
        if isinstance(expr, TensMul):
            ... code for TensMul expr being derived by v ...
            return result
        elif isinstance(expr, TensAdd):
            ... code for TensAdd expr being derived by v ...
            return result
        elif isinstance(expr, Tensor):
            ... code for TensAdd expr being derived by v ...
            return result

In this way you create a partial derivative object (e.g. pd = TensPartialDerivative(A(mu)*A(-mu), A(nu)) and you only compute the partial derivatives if you call pd.doit().

Indeed it's sometimes useful to represent the partial derivatives without computing the result (in SymPy, we like to represent any kind of formulae).

Since this introduces another path in user experience (diff_partial vs. diff), we should also update the documentation/doctests

That's not a problem, it's just three lines of code in an RST file to add. Somewhere in _sympy/doc/modules/, IIRC.

Maybe the new file should rather be called tensor_operators.py? Otherwise, if you agree on adding a new class, it's fine not to create a new file.

diff should throw an error and explain the situation (as you suggested): What error type is suited best? NotImplemented? Further the error should refer to diff_partial.

I'm not sure about that. Maybe this can be done separately in another PR.

Upabjojr commented 4 years ago

If we are going to drop support for Python 2.7, an alternative to isinstance( ... ) is to use singledispatch.

asmeurer commented 4 years ago

Aren't _eval_ methods the same as singledispatch? The only difference is whether the implementation lives on the class or as a separate function.

Upabjojr commented 4 years ago

Aren't _eval_ methods the same as singledispatch? The only difference is whether the implementation lives on the class or as a separate function.

Yes, they are. Sometimes it might be more convenient to put the code in a single place.

joha2 commented 4 years ago

Hey @Upabjojr and @asmeurer thanks for your responses! No problem with the delay, too (on Christmas even the development on github may rest for a while :-)).

I took away from your answers, that:

We take your class proposal to implement partial derivatives; this has two advantages:
- We do not mixup partial and covariant derivatives in tensor expressions
- We get an abstract implementation of partial derivatives for free
In a first step, I will rename the method _eval_derivatives to _eval_partial_derivatives such that it is not called by accident from diff
In a second step, I will take your class proposal and call _eval_partial_derivatives in the doit method
The third step would be to remove _eval_partial_derivatives and let the proposed class do the work
The part about single dispatch functions I did not understand, therefore I would suggest to postpone this discussion/implementation to another PR (therefore sticking to the isinstance if statement)

One question comes to my mind:

Is v in the class definition above handled with the correct index structure? This means:
- First: Gets an upper index converted into a lower one and vice versa automatically?
- Second: Is v "counted" in the sense that TensPartialDerivative(a(i)*a(-i), a(k)) is a "tensor" of 1st rank?

Best wishes Johannes

Edit: minor typos

Upabjojr commented 4 years ago

The third step is not necessary, or maybe it can be just to the future.

Indeed there's a problem if you create TensPartialDerivative(a(i), a(-i))... the code should be able to handle the contraction.

On the other hand it should raise an exception if you call TensPartialDerivative(a(i), a(i)), as this contraction is not valid.

joha2 commented 4 years ago

From a mathematical point of view a derivative by a contravariant tensor is covariant and vice versa. therefore strictly speaking TensPartialDerivative(a(i), a(i)) should give D and TensPartialDerivative(a(i), a(-i)) should fail, see drawing.pdf g3572

In my submitted code, this case is considered by changing the position of the indices, maybe this should also be considered for TensPartialDerivative.

Upabjojr commented 4 years ago

therefore strictly speaking TensPartialDerivative(a(i), a(i)) should give D and TensPartialDerivative(a(i), a(-i)) should fail, see

Do you mean the opposite?

Anyway, Kronecker delta is returned only after .doit() has been called.

Consider these cases:

# 1:
B(-i)*TensPartialDerivative(A(i), C(a))

# 2:
B(-i)*TensPartialDerivative(A(i), C(-i))

They are both valid. In case # 2 the inner i should be replaced by a dummy index.

joha2 commented 4 years ago

I know from syntactic point of view TensPartialDerivative(a(i), a(-i)) seems to be correct, but from a mathematical point of view this means: differentiate a(contravariant) by a(covariant), which does not make sense. Again, #2 should fail, because independently from the free index i, the contravariant i for A and the covariant i for C do not fit together, since due to the derivative, the C has in fact a covariant i, therefore both indices do not fit together. Consider a third case:

# 3:
B(-i)*TensPartialDerivative(A(k), C(i))

which is perfectly valid from syntactic point of view, but since the last i is in fact covariant and the first too, this makes no sense from mathematical point of view.

g4193

Please compare the last two lines, where you can see in the second line, that the i indices do not fit, while the third line is ok. The first line should evaluate to Kronecker delta, which has to have mixed indices to be contractable, therefore the derivative by contravariant a is covariant (and vice versa). Another way to show this is given in: tensor_diff.pdf

Edit: Calculation in attachment

Upabjojr commented 4 years ago

Oh... you're right.

This will introduce some additional complications to the algorithm performing the dummy-index substitution. The current algorithm contracts covariant-contravariant index pairs, it needs to be modified to handle the case of partial derivatives (e.g. double contravariant pairs).

Upabjojr commented 4 years ago

There is already an implementation of PartialDerivative. The behaviour needs to be reviewed, have a look at the tests: https://github.com/sympy/sympy/blob/master/sympy/tensor/tests/test_tensor_operators.py

joha2 commented 4 years ago

Oh... you're right.

This will introduce some additional complications to the algorithm performing the dummy-index substitution. The current algorithm contracts covariant-contravariant index pairs, it needs to be modified to handle the case of partial derivatives (e.g. double contravariant pairs).

For the doit this should be easy, since in _eval_partial_derivative I just substituted the correct Kronecker deltas and metrics. But, for the abstract case, this could be a problem. I observed that you have some method get_free_indices: Could that method be used to return the correct index structure, such that the derivative part is recognized correctly?

There is already an implementation of PartialDerivative. The behaviour needs to be reviewed, have a > look at the tests:

In which relation is PartialDerivative to TensPartialDerivative? Or is this more or less a test, where some concepts are checked? Yesterday, I made a commit to my PR #18093, changing it like we discussed. I also incorporated the TensPartialDerivative class. Though, this needs also a review.

joha2 commented 4 years ago

I had a short look on the PartialDerivative code: Seems like, TensPartialDerivative performs nearly the same task, although PartialDerivative is more focused on component derivatives. Should these classes be separated, anyway?

In the test program, there occurs the same index problem, that we discussed above. I will prepare another PDF where I give my hand calculations and after a review we could maybe prepare another PR for correcting those.

Upabjojr commented 4 years ago

Should these classes be separated, anyway?

I would prefer to have one class.

In the test program, there occurs the same index problem, that we discussed above.

I'm aware of the problem, I'm trying to fix it right now.

I will prepare another PDF where I give my hand calculations and after a review we could maybe prepare another PR for correcting those.

OK, thanks. We also need to expand the unit tests to make sure that all cases are correctly tested.

joha2 commented 4 years ago

These are my findings, where the unit tests are not okay. review_PartialDerivative.pdf

Okay, since the PartialDerivative class was there earlier, I will merge the TensPartialDerivative into it and see, whether there are problems arising in the tests. Is this ok?

joha2 commented 4 years ago

OK, thanks. We also need to expand the unit tests to make sure that all cases are correctly tested.

Just a few ideas:

Standard rules:

Linearity (with arbitrary many summands)
Product rule (with arbitrary many factors)
Consistency: Comparison between different approaches of performing (see tensor_diff.pdf)

Higher order derivatives:

Flattening (as already tested)
Symmetry
(Linearity)
(Product rule)

Contractions:

Inner (divergence type)
Outer (vector gradient type) (as already tested)
Vector analysis identities (div rot = 0, rot grad = 0)
Contractions with Levi-Civita-Tensor

sympy / sympy

Tensor: abstract tensor calculations (no components): calculating derivative wrt symbol within tensor expression does not work #18048