Open speglich opened 3 years ago
So thinking about it, these arguments make sense for that MFE, in the reverse mode, the last computed element is p[time_m -1]
so it has to be time_m=1
to be valid.
In the forward mode, the first computed is p[1]
and last is p[6]
In the reverse mode, the first computed is p[4]
and last is p[0]
So I think with time=6
in reverse mode should be adjoints, maybe would be good to add a source to make sure the automatic bounds are correct.
Fixed...
We think that is not only a question of increase the range on reverse mode. Increasing the range doesn't allow us to guaranty that the memory accessed will be the same.
This example show us what we trying to explain when we are using the reverse mode:
for (int time = time_M, t2 = (time + 1)%(2), t0 = (time)%(2); time >= time_m; time -= 1, t2 = (time + 1)%(2), t0 = (time)%(2))
Considering that the time range initially is from 1
to 6
and I increase the range to 7
, instead to change the range from 0
to 6
, t0 will be 1
when time=7
instead of 0
when time=0
, resulting in differents computed values.
for (int time = time_M, t2 = (time + 1)%(2), t0 = (time)%(2); time >= time_m; time -= 1, t2 = (time + 1)%(2), t0 = (time)%(2))
this is the loop for the reverse mode the one you are considering is for the forward.
But, anyway, is this analysis valid? Make sense?
This is what happens on forward iterations. Last p
calculated is p[0]
.
Forward | m | M | ||||
---|---|---|---|---|---|---|
time | 0 | 1 | 2 | 3 | 4 | 5 |
t0 | 0 | 1 | 0 | 1 | 0 | 1 |
t1 | 1 | 0 | 1 | 0 | 1 | 0 |
Now moving to backward marching. If time range is 5 to 0, the first p read is p[1]
which is wrong. Anyway, I don't think the last write would be invalid, since we actually write in p[(time+1)%2]
, which is 0 in case of time_m=1
and 1 in case of time_m=0
.
Backward | M | m | ||||
---|---|---|---|---|---|---|
time | 5 | 4 | 3 | 2 | 1 | 0 |
t0 | 1 | 0 | 1 | 0 | 1 | 0 |
t2 | 0 | 1 | 0 | 1 | 0 | 1 |
But if we march from 6 to 1, as @mloubout says, then we get p[0]
as our first read, which is the last p
we write on forward marching, which I think is the right thing to do.
Backward | M | m | ||||
---|---|---|---|---|---|---|
time | 6 | 5 | 4 | 3 | 2 | 1 |
t0 | 0 | 1 | 0 | 1 | 0 | 1 |
t2 | 1 | 0 | 1 | 0 | 1 | 0 |
The problem with this approach is that it's working with this MFE, but another examples are raising an exception:
File "/home/victor.leite/lde/devito-issues/devito/types/dimension.py", line 290, in _arg_check
raise InvalidArgument("OOB detected due to %s=%d" % (self.max_name,
devito.exceptions.InvalidArgument: OOB detected due to time_M=246
So I'm checking that also..
Ok, so the problem is when we use other functions that are accessed with the original time
index (e.g. source or receiver functions). On these arrays, if we use time_M >= time_size
we try to access out of bound indices.
So, if we need the last time step from a function previously calculated with .forward
, and also we use a fixed time size function at the second operator, we need one more adjustment:
import numpy, random
from devito import Grid, TimeFunction, Eq, Operator, Buffer
grid = Grid(shape=(2,))
p = TimeFunction(name='p', grid=grid, space_order=0, dtype=numpy.float64, save=Buffer(2))
s = TimeFunction(name='s', grid=grid, space_order=0, dtype=numpy.float64, save=11)
s.data[:] = [ [random.random()] for n in range(11) ]
# FORWARD
op1 = Operator(Eq(p.forward, p + s))
op1.apply(time=10)
# BACKWARD
# 1) Numerical error
# - Don't get the last p and s
# - Execute less time steps (time_m=1)
# op2 = Operator(Eq(p.backward, p - s))
# op2.apply(time=10)
# 2) Exception
# - s does not have s[11] position
# op2 = Operator(Eq(p.backward, p - s))
# op2.apply(time=11)
# 3) OK
# Read s with right index
op2 = Operator(Eq(p.backward, p - s.backward))
op2.apply(time=11)
# 4) OK
# Avoid time_m=1 and read p with right index
# op2 = Operator(Eq(p, p.forward - s))
# op2.apply(time=10)
assert (numpy.all( abs(p.data[0]) < 1e-15))
Assuming the user needs to be aware of time buffer behavior (especially about that implicit time_m
shift) I think we could close this issue, since it isn't a bug but an expected behavior, right @FabioLuporini @mloubout ?
Anyway I guess that behavior could be turned into something more user friendly.
Hello folks,
As noticed by @nogueirapeterson, we are experiencing numeric precision errors with
np.float64
in adjoint tests, he introduced this errors on Slack (here) and now we figured out what’s going on.The error isn’t connected with the precision used or with the equations implemented (e.g. acoustic self-adjoint), but with the time arguments (
time_m
andtime_M
) that is computed for forward and backward operators. The difference betweentime_m
andtime_M
are the cause of this precision error, that only could be noticed when we usednp.float64
.The adjoint test is attached on this issue, however we build a MFE to explain what we found.
In this MFE we could notice that
time_m
parameter is different for both cases, causing different time steps for each operator.The
time_m
andtime_M
values are computed by_prepare_arguments
(if they are not passed by user) and usesdspace
to avoid out of boundary acesses. As we know and expect thedspace
computed in each operator are different, and are perfectly correct, but maybe the way that we construct the time range interval could be improved, once the out of boundary access don’t happen on this time dimensions.adjoint_test.txt