xarray-contrib / xarray-simlab

Xarray extension and framework for computer model simulations
http://xarray-simlab.readthedocs.io
BSD 3-Clause "New" or "Revised" License
73 stars 9 forks source link

cyclic foreign vars #177

Closed feefladder closed 3 years ago

feefladder commented 3 years ago
feefladder commented 3 years ago

proof-of-concept PR for allowing user-defined custom dependencies, still needs:

the arrows from inout still also show up if the process that uses the inout var is set to explicitly depend on it

feefladder commented 3 years ago

So, we probably still have to move the check to a place where we can actually view the model.

For the checking, I used a dictionary of {p_name:{dep1,de2,dep3}}, that is also very useful for any other graph inspection. Do you think it is a small enough datastructure to be kept? (this allows for very concise/faster code in other functions.

benbovy commented 3 years ago

graph reduction (now) heavily depends on the deps_dict (and is really just a few lines), so I don't think that shoudl be split.

Besides the implementation, we still have to decide how to expose this feature, whether to enable it or not by default, etc.

Also, it's not yet very clear to me what is the motivation behind the dashed arrows for inout variables.

Honestly, it's too much for me to follow and discuss in a single PR.

Adding user-defined dependencies is a already a big feature that has a high potential to make the framework more flexible, so I'd really appreciate if we could progress incrementally to make sure that we get the design right and the feature robust.

jvail commented 3 years ago

Also, it's not yet very clear to me what is the motivation behind the dashed arrows for inout variables.

I feel guilty having started this particular feature discussion :) So let me give a practical example from the model I am working on (it's a bit messy right now - but work-in-progress):

model-graph

The process topology (and others) depend on the result of the arch_dev process (t minus one). And imho it would be a good improvement if that could be visualized somehow.

feefladder commented 3 years ago

graph reduction (now) heavily depends on the deps_dict (and is really just a few lines), so I don't think that shoudl be split.

Besides the implementation, we still have to decide how to expose this feature, whether to enable it or not by default, etc.

Also, it's not yet very clear to me what is the motivation behind the dashed arrows for inout variables.

Honestly, it's too much for me to follow and discuss in a single PR.

Adding user-defined dependencies is a already a big feature that has a high potential to make the framework more flexible, so I'd really appreciate if we could progress incrementally to make sure that we get the design right and the feature robust.

Ok. I get that, I can move the reduction to another PR, as well as maybe adding custom dependencies. Then we have:

This will quite test my Git skills....

@jvail no worries, the main part of this PR is about enabling (and verifying) inout variables, that was already an issue. The dotted arrows part will need some more work and is actually quite complex, not sure if I'll find time for that

benbovy commented 3 years ago

Thanks for the illustration @jvail.

I was getting confused because we (I feel guilty too) involve two separate problems in the discussion here and in #164:

  1. we have a quantity that we want to use and possibly update during the same timestep (or simulation stage) in one or more processes executed with some specific constraints on the workflow.
  2. we have a quantity a which depends on quantity b, but b depends on a computed at the previous timestep or simulation stage.

Problem 1 is what I thought @Joeperdefloep you describe in your https://github.com/xarray-contrib/xarray-simlab/issues/164#issuecomment-802873790 (as you mentioned Fastscape's SurfaceToErode which is a workaround to that specific problem). The introduction of user-defined process dependencies (+ allow intent='inout' for foreign variables) would solve this problem.

Problem 2 is what @jvail you initially describe in #164. The current workaround is to declare an inout variable for a, define an input value as the initial value at t = -1, and make sure that it is not updated during the same step/stage than when it is accessed to compute b. However, this approach is a bit fragile. With the introduction of user-defined process dependencies, you will instead need to provide an explicit dependency when building your model so that you can safely compute b before updating a during the same step/stage.

That said, for problem 2 you'll still need an inout variable (+ input value) for a. If you want to define a with intent='out', then the cleanest way to do this is to introduce a whole new concept of a 3rd dimension (or some kind of step/stage unroll) for the graph of processes in xarray-simlab (cf. your https://github.com/xarray-contrib/xarray-simlab/issues/164#issuecomment-789414007). I don't think it's worth the extra complexity, though (the intent concept is also invariant of the simulation stage). Alternatively, we could add an exception to the variable intent heuristics (cf. declare a variable as seemingly_cyclic) but I don't really see much advantage of this approach compared to the inout + input value workaround.

feefladder commented 3 years ago

@jvail the difficulty: imagine the graph:

  inout1->inout2->inout3
   /   \ /         \
 in1   in2          in3

then, we want arrows only from inout3 to in1, so we need some sort of sorting algorithm for that. For all inout variables. That is quite the algorithm.

feefladder commented 3 years ago

@benbovy I think I will close this PR until I have properly split it

benbovy commented 3 years ago

Yeah sorry @Joeperdefloep a fresh start in new branches is probably the best thing to do.

What we need first is a PR that

In the same PR or the next one:

Then in follow-up PRs: