chr5tphr / zennit

Zennit is a high-level framework in Python using PyTorch for explaining/exploring neural networks using attribution methods like LRP.
Other
183 stars 33 forks source link

MixedComposite and NameLayerMapComposite for combining arbitrary Composites #160

Closed dkrako closed 1 year ago

dkrako commented 1 year ago

This way it is possible to specify a fallback layer map if we want to apply more general rules to the remaining layers.

The parameter is optional. If None is passed, there's no change in functionality. If a layer_map list is passed with the same format as for the LayerMapComposite, then this will be used as a fallback if there's no hook for this specific layer name.

Just an example, where it's possible to set one Hook for the first two convolutional layers, and a different one for the remaining ones:

cmp = NameMapComposite(
    name_map=[(('conv_1, 'conv_2'), Flat()],
    layer_map=[((zennit.types.Convolution, ), AlphaBeta())],
)

I hope you deem this to be a useful addition to this package.

chr5tphr commented 1 year ago

Hey @dkrako

thank you for the PR! I actually had thought of this feature before, so on that end I think it's definitely worth to have!

Implementation

I originally thought of either adding a new mixing Composite type (e.g. MixedComposite) which can mix arbitrary Composites, or implementing a new Composite type e.g. NameLayerMapComposite, which has the functionality you suggested for the current NameMapComposite.

For simplicity's sake, I think the latter solution may be preferable over changing NameMapComposite. I think it makes sense to keep the vanilla NameMapComposite as a way to have explicit-mapping-only-Composites.

Thinking about the actual fall-backs a user may have, I think instead of using a layer_map, a built-in Composite may be more common fallback candidate, e.g. "what happens if I only change this layer".

With this in mind, I actually tend more towards the MixedComposite solution, as complexity-wise it is actually not so different.

A MixedComposite would then be used like this:

composite = MixedComposite([
    NameMapComposite([(('conv_1, 'conv_2'), Flat())]),
    LayerMapComposite([((zennit.types.Convolution, ), AlphaBeta())])
])

A NameLayerMapComposite could theoretically be implemented using

class NameLayerMapComposite(MixedComposite):
    def __init__(self, name_map=None, layer_map=None, canonizers=None):
        super().__init__([
            NameMapComposite(name_map=name_map),
            LayerMapComposite(layer_map=layer_map),
        ], canonizers=canonizers)

And finally, a draft prototype for MixedComposite would be:

class MixedComposite(Composite):
    def __init__(self, composites, canonizers=None):
        if canonizers is None:
            canonizers = []
        self.composites = composites
        super().__init__(
            module_map=self.mapping,
            canonizers=sum([composite.canonizers for composite in composites], canonizers)
        )

    def mapping(self, ctx, name, child):
        # create a context for each of the sub-composites if ctx is empty
        if not ctx:
            ctx.update({composite: {} for composite in self.composites})

        # evaluate all hooks so all composites can update their context
        hooks = [composite.module_map(ctx[composite], name, child) for composite in self.composite]

        return next((hook for hook in hooks if hook is not None), None)

Requirements

I think it would be better to implement the more general MixedComposite.

The To-Do's associated with this would then be

Would you be willing to complete the implementation and change the PR accordingly?

dkrako commented 1 year ago

Thank you very much for the suggestions. I agree, a general MixedComposite is an even more flexible and elegant approach. I think the name is good and describes what the class is actually doing. One thing to note will be that the list order of the composites corresponds to their "priority".

Regarding your draft prototype, I don't really understand why you are summing up the canonizers. I also don't really understand the purpose of the ctx variable, as it seems to be an empty dict all the time. This actually makes me worry if I'm the right one to do the job accordingly.

As far as I see, the implementation of the two new composite classes doesn't need any more work apart from adding docstrings, right? So what's left to do would be just adding documentation and tests to your draft implementations. I have the feeling that you'd also be better for writing and wording the How-To's, as you have the bigger picture of the usage of your own package.

So all in all it seems that it will be more efficient if you would complete this implementation yourself instead of us writing back and forth. I would still be happy to contribute to this, but I think it will take longer as originally expected.

Let me know what you think of this. I'll then copy your draft to the PR, add documentation, basic tests and a first draft for the How-To's at the end of the week.

chr5tphr commented 1 year ago

Hey @dkrako sorry for the delay.

Regarding your draft prototype, I don't really understand why you are summing up the canonizers. I also don't really understand the purpose of the ctx variable, as it seems to be an empty dict all the time. This actually makes me worry if I'm the right one to do the job accordingly.

I summed them up (they are lists) in the draft so that all canonizers of all composites are used. Since the .register() function is not called for the sub-composites, the canonizers would otherwise not be applied. However, I suspect it may not be a common use-case to specify multiple canonizers through the sub-composites, but rather supply the ones that are really needed directly to MixedComposite.

The ctx variable carries some context, for example, in SpecialFirstLayerMapComposites, it keeps track of whether the first layer has already been seen. It will be initialized to an empty dict in the beginning of Composite.register, and then the same dict will be used in subsequent calls to module_map, so that the function can have a state during the registration-process only. The MixedComposite's ctx simply separates the ctx variables of the sub-composites in its own ctx. One example where it is important all Composites are called and have their own ctx updated is when multiple instance of SpecialFirstLayerMapComposite are used, so that the lower priority instance does not assign its first-layer-rule to the first layer that passes through to it.

As far as I see, the implementation of the two new composite classes doesn't need any more work apart from adding docstrings, right? So what's left to do would be just adding documentation and tests to your draft implementations. I have the feeling that you'd also be better for writing and wording the How-To's, as you have the bigger picture of the usage of your own package.

That's the hope, however, I wrote the draft only quickly and without testing it, so there may come things up that I did not yet think of.

So all in all it seems that it will be more efficient if you would complete this implementation yourself instead of us writing back and forth. I would still be happy to contribute to this, but I think it will take longer as originally expected.

Let me know what you think of this. I'll then copy your draft to the PR, add documentation, basic tests and a first draft for the How-To's at the end of the week.

If you would like to do it and feel up to it, I do not mind writing back and forth. Otherwise I can also finish it up.

dkrako commented 1 year ago

Thanks @chr5tphr, good to hear and nevermind the delay.

The first steps aren't that complex and you'll review the code either way, so I feel up to it and am happy to contribute. I'll also try to draft some addition to the composite tutorial and add some base test cases. The correct setup of ctx should then also be tested, right?

I will tackle this next week as I've got a deadline this friday, so expect to hear from me about the end of next week.

chr5tphr commented 1 year ago

The correct setup of ctx should then also be tested, right?

Yes, that would be good. I would keep it general and maybe verify using a DummyComposite implementation that adds the modules it has seen to its ctx, and afterwards check whether all modules of the model ended up in all sub-composite's ctx.

I will tackle this next week as I've got a deadline this friday, so expect to hear from me about the end of next week.

Thank you, looking forward to the first draft!

dkrako commented 1 year ago

Sorry for the long delay, my schedule was just too tight during the last weeks (but hopefully you will get a new citation soon as I have used your package for LRP attributions in a paper submission)

I did the basic stuff now:

I have already created some fixtures and empty failing test stubs for the two new classes. I hope I manage to finish up the tests until the end of the week.

chr5tphr commented 1 year ago

Hey @dkrako

Sorry for the long delay, my schedule was just too tight during the last weeks (but hopefully you will get a new citation soon as I have used your package for LRP attributions in a paper submission)

No worries, I know how it is, looking forward to reading the paper.

I did the basic stuff now:

* [x]  Implement the MixedComposite in src/zennit/composites.py (with docstrings)

* [x]  Implement the NameLayerMapComposite in src/zennit/composites.py (with docstrings)

* [ ]  Add documentation, ideally a section in [How-To's/Composites after Cooperative LayerMapComposite](https://zennit.readthedocs.io/en/latest/how-to/use-rules-composites-and-canonizers.html#cooperative-layermapcomposites)

* [ ]  Add tests in tests/test_composites.py

I have already created some fixtures and empty failing test stubs for the two new classes. I hope I manage to finish up the tests until the end of the week.

Thanks a lot!

In order to better see the actual additions for easier reviewing, could you, for now:

If you would do git rebase -i origin/master (in case origin points to https://github.com/chr5tphr/zennit.git), your rebase-todo would look like this:

pick cfd2bff Add layer_map argument to NameMapComposite.
squash 0bb6107 apply requested changes and additions to composites.py
pick 9749457 add fixtures for MixedComposite and NameLayerMapComposite
pick a821ce5 add test stubs for MixedComposite and NameLayerMapComposite
dkrako commented 1 year ago

I have added documentation to the class docstrings and the howto's.

I also implemented a first draft of tests for the two new classes. For the NameLayerMapComposite I combined the test cases from test_composite_name_map_registered and test_composite_layer_map_registered.

For the MixedComposite I combined the test cases from test_composite_name_map_registered and test_composite_special_first_layer_map_registered.

Unfortunately the tests for MixedComposite are failing. For now I don't really get why this happens, but I will dig deeper. Somehow there are being epsilon composites used in the fixtures for the test function.

dkrako commented 1 year ago

@chr5tphr I have fixed the test for MixedComposite and now I think this PR should be ready for review.