Closed Zettelkasten closed 3 years ago
Then,
split
should also be optimized out as an output layer.
Yes.
This is not the case however, because:
split/0
andsplit/1
havecollocate_with=["split"]
...
Hm, I think our whole logic with sub layers is a bit buggy, or not even well defined.
For the case of subnetwork sub layers, they are all completely independent (and the main subnet layer is just the same as the output
sub layer of it).
Otherwise (basically all other layers which have sub layers), they all belong together. Like in this case for SplitLayer
.
Probably this second case is buggy.
I remember that @patrick-wilken did some changes/fixes recently on this?
(Actually, in my use case, I don't even have a layer
input
that recurrently depends on itself. But instead, I use a:i
layer, which currently never is optimized out as an input layer:if layer.name in [":i", "end"]: # currently not fully implemented return False
Somewhat related - what is missing of the implementation there?)
Let's keep this as a separate issue, and not mix them up here. This is really unrelated / orthogonal.
The layer responsible for :i
(RecStepInfoLayer
or sth like that) is just not implemented (as far as I remember).
But also, it should not matter much, as this optimization should really be minor (it would use tf.range
or so outside but I don't think that the tf.while_loop
for :i
would have any measurable effect). And any layers who depend on :i
could still be moved out, even if :i
itself is not moved out. Because :i
itself depends on nothing.
however,
split
cannot be moved out because there is an dependency onsplit/1
Is this dependency the right direction? Does this dependency makes sense? I would expect that split/1
depends on split
. Maybe that is the real problem here, and with that fixed, it would already work?
however,
split
cannot be moved out because there is an dependency onsplit/1
Is this dependency the right direction? Does this dependency makes sense? I would expect that
split/1
depends onsplit
. Maybe that is the real problem here, and with that fixed, it would already work?
Äh my bad, split/1
depends on split
. But that means that if split/1
is not optimized out (yet), split
should also not be optimized out (as output layer).
Otherwise (basically all other layers which have sub layers), they all belong together. Like in this case for
SplitLayer
.Probably this second case is buggy.
As far as I understand, the layers are kept together by setting collocate_with
of all sub layers to the parent.
If there is only one sub layer, then this was already handled by the logic:
When checking if it should optimize the sub layer sub
(here: split/0
), it will check if it can optimize out the parent layer if it ignores the dependency to sub
.
When there is another layer sub2
(here: split/1
), that is also collocated to the parent, this fails because the dependency to sub2
is not ignored by this check.
So in #559, I updated the logic to not only ignore sub
, but also all other layers that should be collocated to the parent.
[regarding
:i
...] Let's keep this as a separate issue, and not mix them up here. This is really unrelated / orthogonal. [..] And any layers who depend on:i
could still be moved out, even if:i
itself is not moved out. Because:i
itself depends on nothing.
Yes totally, layers that depend on :i
of course can be moved out, but they can only be moved out as output layers then. In my case, that does not work, because this logic is buggy right now for SplitLayer
. So adding optimize out for the :i
layer would act as a workaround for this issue then. (But yes, let's keep this as an independent issue. And let's try to fix this properly of course.)
Hm, so we have both the collocate_with
logic and the dependencies
logic for sub layers. Maybe we do not need the dependencies
logic, and the collocate_with
logic covers everything what we need, and maybe then the current logic already works?
Hm, so we have both the
collocate_with
logic and thedependencies
logic for sub layers. Maybe we do not need thedependencies
logic, and thecollocate_with
logic covers everything what we need, and maybe then the current logic already works?
Hm, yea. I tried it out, but it does not really change anything. It still isn't able to optimize out SplitLayer
in my test case.
In general, we can change that "a collocated with b" implies "a does not depend on b", but that's orthogonal to this issue (and I don't think it's very useful). It also does not make anything easier in the logic, as we can also check for this explicitly in the logic if we want to, e.g. check for "a depends on b and a is not collocated with b" instead of "a depends on b".
In general, we can change that "a collocated with b" implies "a does not depend on b", but that's orthogonal to this issue (and I don't think it's very useful).
No, I think this doesn't make sense. collocate_with
should be orthogonal to the dependency logic.
But my earlier comment on collocation was wrong. Collocation is not directed, unlike dependency. It goes both ways (even if the flag in config / layer attribute is directed).
But also see my other comments in the PR.
Consider some net dict of a rec unit like this (see #559 for a full test case):
Then,
split
should also be optimized out as an output layer. This is not the case however, because:split/0
andsplit/1
havecollocate_with=["split"]
set by_TemplateLayer.get_sub_layer
, but also an dependency on their parent layersplit
.E.g.
split/0
is not optimized out because the logic ofoutput_can_move_out(split/0)
does the following:split
can be moved out with_collocated={split/0}
split
cannot be moved out becausesplit/1
depends on it. Andsplit/1
has not been moved out yet. [edited.]Symmetrically,
split/1
is also not optimized out.(Actually, in my use case, I don't even have a layer
input
that recurrently depends on itself. But instead, I use a:i
layer, which currently never is optimized out as an input layer:Somewhat related - what is missing of the implementation there?)