This PR changes the behavior of `ActNorm` by registering the `initialized` flag as a buffer. This fixes #4: saving a state dict, loading it, and resuming training will no longer re-initialize the `ActNorm` layers.
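For illustration, here is a minimal sketch of the pattern (the shapes, parameter names, and initialization details are illustrative assumptions, not the exact layer code in this repo):

```python
import torch
import torch.nn as nn

class ActNorm(nn.Module):
    """Activation normalization with data-dependent initialization."""

    def __init__(self, num_features):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(1, num_features, 1, 1))
        self.scale = nn.Parameter(torch.ones(1, num_features, 1, 1))
        # Registered as a buffer so the flag is saved in (and restored
        # from) the state dict, instead of being a plain Python attribute.
        self.register_buffer("initialized", torch.tensor(False))

    def forward(self, x):
        if not self.initialized:
            # Data-dependent initialization on the first batch only.
            with torch.no_grad():
                mean = x.mean(dim=(0, 2, 3), keepdim=True)
                std = x.std(dim=(0, 2, 3), keepdim=True)
                self.loc.data.copy_(-mean)
                self.scale.data.copy_(1.0 / (std + 1e-6))
            self.initialized.fill_(True)
        return self.scale * (x + self.loc)
```

Because `initialized` is now part of the state dict, a reloaded layer sees `initialized == True` and skips the data-dependent initialization.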
It also adds a test that checks `ActNorm` layers stay consistent across saving and loading a state dict.
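A hedged sketch of what such a round-trip test could look like (the function name and tensor shapes are illustrative, not the actual test added here):

```python
import torch

def test_actnorm_state_dict_roundtrip():
    # Assumes an ActNorm layer like the sketch above.
    layer = ActNorm(num_features=8)
    x = torch.randn(4, 8, 16, 16)
    layer(x)  # triggers data-dependent initialization

    # Save and load into a fresh layer.
    restored = ActNorm(num_features=8)
    restored.load_state_dict(layer.state_dict())

    # The restored layer must not re-initialize on its next forward pass.
    assert bool(restored.initialized)
    y = torch.randn(4, 8, 16, 16)
    assert torch.allclose(layer(y), restored(y))
```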
**Warning:** this PR breaks backwards compatibility with old saved state dicts, since they do not contain the new `initialized` buffer (or at least loading an old state dict will require some workaround).
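One possible workaround when loading an old checkpoint (the model variable and checkpoint path below are hypothetical):

```python
import torch

# "model" stands for any module containing the new ActNorm layers;
# the checkpoint path is hypothetical.
model = ActNorm(num_features=8)
state = torch.load("old_checkpoint.pt")
missing, unexpected = model.load_state_dict(state, strict=False)
# Old checkpoints have no "initialized" buffers, so mark the layers as
# already initialized to avoid re-running data-dependent init on resume.
for name, buf in model.named_buffers():
    if name.endswith("initialized"):
        buf.fill_(True)
```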