modern-fortran / neural-fortran

A parallel framework for deep learning
MIT License
404 stars 83 forks source link

Allow setting per-layer activation function #7

Closed milancurcic closed 5 years ago

milancurcic commented 5 years ago

Activation function is currently set globally on the network level. It's beneficial to allow using a different activation functions for some layers by implementing layer % set_activation(). This change will not be API-breaking.

zbeekman commented 5 years ago

Sounds like a strategy pattern could be used, or procedure pointers.

milancurcic commented 5 years ago

See WIP using procedure pointers here:

https://github.com/modern-fortran/neural-fortran/tree/7-allow-activation-per-layer

I believe it should be good to go except that I get an ICE with gcc-8.2.0. I didn't have chance to revisit it yet, but I think it's close to done.

milancurcic commented 5 years ago

Most work done 7-allow-activation-per-layer, and activation functions are now set completely on a layer_type level, with network_type % set_activation() as a thin wrapper to set a function for all layers at once.

gcc-8.2.0 ICE comes up only if associated name is used to reference a nested procedure pointer. For example, this builds and runs OK:

    associate(layers => self % layers)
      ...
      do n = 2, size(layers)
        ...
        layers(n) % a = self % layers(n) % activation(layers(n) % z)
      end do
    end associate

but this triggers an ICE:

    associate(layers => self % layers)
      ...
      do n = 2, size(layers)
        ...
        layers(n) % a = layers(n) % activation(layers(n) % z) ! <-- can't reference procedure pointer component of an instance defined in an associate block
      end do
    end associate

And here's the ICE:

f951: internal compiler error: find_array_spec(): unused as(2)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-8/README.Bugs> for instructions.

This is not a biggie. We can just use slightly more verbose code. I will add a few tests and merge. Will also open a separate issue for the ICE and try to reproduce in a minimal example.