Parallel-in-Time / time4apint

Mixing GFM method with PinT task scheduling
https://jupyterhub.mat.tu-harburg.de/blockops/
0 stars 0 forks source link

Factor with cost-relevant operator #12

Closed tlunet closed 1 year ago

tlunet commented 1 year ago

Hi @JensHahne,

following our discussion, here is the image with the correct task factorization for PFASST :

image

And I pushed a sympy testing script that can serve as basis to find the good sympy manipulation ...

tlunet commented 1 year ago

Putting the relevant Sympy documentation here : https://docs.sympy.org/latest/tutorials/intro-tutorial/manipulation.html

JensHahne commented 1 year ago

Just a quick update: sympy.collect() does not work for non-commutative symbols, and it seems that this is because the problem is quite difficult for these kinds of expressions.

I implemented a version based on string manipulation. It's pretty hacky, but I think it could work. I've been pushing it. One part is still missing, which is the return from str to sympy expression. There is an exception when I try it with simpify.

Unfortunately, I don't have much time today. Maybe you can find some time, otherwise I can try it tomorrow. If it works, the functions need to be added to run.py and called before self.taskGenerator(rule=rule, res=res) in createExpressions(self).

tlunet commented 1 year ago

I saw indeed for the sympy.collect() problem ... Saw what you tried with the string manipulation, and I'm currently on an other lead : the manual decomposition of the args element of the expanded expression, and use of dict to regroup the same terms together ... I'll push as soon as I get something running, then we can compare our two solutions ...

tlunet commented 1 year ago

Quick update :

  1. First of, I'm impressed by the simplify_string function ... you really went far to implement a string parser for the factorization :sweat_smile:
  2. I've pushed an alternative solution for the factorization using a dictionary, and how to extract the tasks from it. It's on the sympyTesting2.py script.

Here are the main ideas :

Factorized dictionnary :
 -- note : (+) are additions, (x) are multiplications
(+) u_1^0 (x) 1
(+) \tilde{\phi}**(-1) (x)
   (+) \chi (x) u_0^0
   (+) \phi (x)
      (+) -1 (x) u_1^0
(+) T_C^F (x)
   (+) \tilde{\phi}_C**(-1) (x)
      (+) T_F^C (x)
         (+) \chi (x) u_0^1
         (+) \phi (x)
            (+) -1 (x) u_1^0
            (+) \tilde{\phi}**(-1) (x)
               (+) \phi (x) u_1^0
               (+) \chi (x)
                  (+) -1 (x) u_0^0
List of tasks :
 -- T0
    -- operator : \chi
    -- input : u_0^0
    -- dependency : u_0^0
    -- result : \chi*u_0^0
 -- T1
    -- operator : -1
    -- input : u_1^0
    -- dependency : u_1^0
    -- result : -u_1^0
 -- T2
    -- operator : \phi
    -- input : -u_1^0
    -- dependency : T1
    -- result : -\phi*u_1^0
 -- T3
    -- operator : \tilde{\phi}**(-1)
    -- input : \chi*u_0^0 - \phi*u_1^0
    -- dependency : T0 + T2
    -- result : \tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0)
 -- T4
    -- operator : \chi
    -- input : u_0^1
    -- dependency : u_0^1
    -- result : \chi*u_0^1
 -- T5
    -- operator : -1
    -- input : u_0^0
    -- dependency : u_0^0
    -- result : -u_0^0
 -- T6
    -- operator : \phi
    -- input : -\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) - u_1^0
    -- dependency : T1 - T3
    -- result : \phi*(-\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) - u_1^0)
 -- T7
    -- operator : T_F^C
    -- input : \chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0)
    -- dependency : T4 + T6
    -- result : T_F^C*(\chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0))
 -- T8
    -- operator : \tilde{\phi}_C**(-1)
    -- input : T_F^C*(\chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0))
    -- dependency : T7
    -- result : \tilde{\phi}_C**(-1)*T_F^C*(\chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0))
 -- T9
    -- operator : T_C^F
    -- input : \tilde{\phi}_C**(-1)*T_F^C*(\chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0))
    -- dependency : T8
    -- result : T_C^F*\tilde{\phi}_C**(-1)*T_F^C*(\chi*u_0^1 - \phi*(\tilde{\phi}**(-1)*(\chi*u_0^0 - \phi*u_1^0) + u_1^0))

This seems to be correct : in particular, there is only one task associated to each the "expensive" operators $\tilde{\phi}_C^{-1}$ and $\tilde{\phi}^{-1}$

Not sure how it compare to your solution in terms of speed, but at least we get also the tasks in the same run. Only thing that is missing is the replacement of all $u_0^k$ by $u_0^0$ and the symbolic simplification that follows (but this should be done only for the first block, and before doing any factorization ...)

JensHahne commented 1 year ago

I really like this, looks great. The idea is different than before, so there are probably some pitfalls. But I will incorporate it tomorrow and add the last missing things.

JensHahne commented 1 year ago

https://github.com/Parallel-in-Time/time4apint/blob/gaia/output/01_SIAM-CSE22/img/PFASSTTaskGraph.pdf

There is a first version of the updated task graph. I know, there are still multiple thing to improve. This is only a first version to show that it seems to work using your approach.