Open lukasm91 opened 5 years ago
I am not sure about this restriction. We identified cases where we need indirection in the vertical access, where on the LHS you need to write even into a runtime offset cell. That would require allowing the left Expr even being a VarAccess for example. I would even dare to say that the interval partitioners and data dependency analyses of dawn are able to deal with it (not sure 100% though in all the cases), when the offset is like k+1. The runtime offset of course we are not ready yet.
Even if some optimizers are not able to deal with it, I would probably not impose the restriction on the definition of the SIR node, but rather on a pass that checks all the expressions. For example we could implement a pass LHSExprChecker that is active when other optimizers that are incompatible with u[k+1] = exprs are also active.
I think we also had the discussion in GridTools (with @havogt and I guess also Anton), and any write to a k-offset you can usually rewrite such that you don't have k-offsets, at least we were able to do that for the cosmo dycore.
There are several problems when you allow write to k-offsets. First, caching gets way harder because you cannot look at the vertical regions anymore to see what cache size you need to allocate. Second, I see an inconsistency if we allow write with non-zero vertical offsets (because this cannot be done with horizontal). People might be tempted to write to non-zero horizontal offsets too. Third, GT4Py does not support this right now.
Regarding where to put the restriction, why wouldn't you put the restriction on the SIR level? If we can easily do it, the SIR should represent the interface as close as possible to enforce compatibility as much as possible. Checks will throw errors, a restrictive protobuf definition will enforce compatibility by design.
@Stagno was also involved when discussing this.
I think for now we don't have the use-case and therefore such a protection would be good in my opinion, if it is simple to do. Longer term I would go for changing the SIR, concerning this special handling of k, by making k addressing explicit for the user.
I have no strong opinion for what concerns offset vertical accesses, because I don't have enough background information. But for horizontal accesses it is clear to me that we would greatly benefit from imposing a centered write access on the LHS: extents computation would be much simpler and also manipulating the AST, in general, would be easier if LHS is always a center. This should be reflected in the interface definition (protobuf and AST classes), because a runtime check in this case makes the intended interface and the exposed one diverge.
few considerations here:
To answer @lukasm91, I agree that kcache is way harder for example. I dont know what pieces of the toolchain could deal with this, but we could disable all the optimizations (including kcaches) and most likely be able to support this functionality. Then we could discuss which optimization we would like to support for this pattern, and for example give it a try to kcaches and see how we could make it work, by looking at the generated code. That is why I think if this is a case we need to support, we should disable optimizations and not completely disabled the pattern in the SIR. If there is no use case, I agree we can disable it at the top level.
I will add another task to figure out first if we need to support it or not and put a dependency here
and if there are, whether it is possible to revert the formulation to a center based scheme (like in COSMO)
I am wondering whether this can be done automatically (more or less) easily by just shifting intervals properly. Simple example:
vertical_region(10, 20) {
w[k+1] = u[k];
}
-->
vertical_region(11, 21) {
w = u[k-1];
}
This is a trivial example of course, but if we could do this in general, we don't need any workarounds. It would still be the question if dawn should do that, but it could be done. I don't like the idea to adding features for which we disable certain optimizations because it becomes very hard for the user (and for us too) to understand what's going on.
It is an interesting idea, but not a trivial one. When the RHS is just input data, it is, but when not, you need to make sure that the RHS is already computed when moving it into a different vertical region.
I don't like the idea to adding features for which we disable certain optimizations because it becomes very hard for the user (and for us too) to understand what's going on.
I understand that. You could see it from a different angle, many of the passes, caches, temporaryToStencilFunction do scan of all write to temporaries and impose a required condition in order to apply the algorithm to that temporary. If it is not met it will skip it. In this case would be no different, if the pass detects a LHS with no centered access it should skip it
I tend to agree that we should not write to offsets (if there is no use-case). The example which get's really hairy, is if we start discussing indirection in the vertical, where we have no means to protect against races if we allow for indirection on the LHS.
We should also consider this in the light of impact on the compute model. I fear that this might add quite some complexity on the parallel model because it allows other kinds of data races.
I think we should start making our SIR more restrictive. Currently, we can write to offsets, which we should not do (also GT4Py does not support this).
Currently there is one use-case in "clang-gridtools", which should be easy to fix.
The goal here is to fix the left hand side of an AssignmentExpr:
left should be something like
oneof {LiteralAccessExpr literal, string field_name}
.Steps: