Closed by accosmin 8 years ago
Reduce the number of parameters from O(#outputs \times #inputs) to O(#outputs).
Some ideas:
Also, a new variation of the linear layer: use normalized weights as the parameters (as described above).
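A minimal sketch of what a linear layer with normalized weights might look like. It assumes "normalized" means scaling each output's weight row to unit L2 norm, which the issue does not specify. The names `normalize_rows` and `linear_forward` are hypothetical and do not come from the library. The sketch still stores #outputs × #inputs weights, so it illustrates only the normalization step, not the parameter reduction.

```python
import math


def normalize_rows(weights, eps=1e-12):
    """Scale each weight row (one row per output) to unit L2 norm.

    Assumption: per-output L2 normalization; the issue does not say which
    normalization is intended.
    """
    normalized = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        # eps guards against division by zero for an all-zero row
        normalized.append([w / max(norm, eps) for w in row])
    return normalized


def linear_forward(weights, bias, inputs):
    """Compute outputs[o] = dot(normalized_weights[o], inputs) + bias[o]."""
    rows = normalize_rows(weights)
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(rows, bias)
    ]


if __name__ == "__main__":
    weights = [[3.0, 4.0], [0.0, 2.0]]  # 2 outputs, 2 inputs
    bias = [0.0, 1.0]
    print(linear_forward(weights, bias, [1.0, 1.0]))
```

Because each row is rescaled before use, the output depends only on the direction of each weight row, not on its magnitude.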