kronenthaler / libai


Changes in libai.common.functions #15

Closed dktcoding closed 7 years ago

dktcoding commented 7 years ago

I have a couple of proposed changes for the libai.common.functions package:

Function

For the Function interface, the idea is to replace Function#getDerivative() with Function#evalDerivative(double) and simply return Double.NaN for functions with no derivative (e.g. Sign, SymmetricSign).

Given that Function#getDerivative() is only used (at least currently) in MLP, this should not cause any major inconvenience, particularly since we never use a derivative of order higher than 1. The idea is to clean up the Function code in order to make it more readable (see alternative).
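To make the proposal concrete, here is a rough sketch of what the interface could look like. Only evalDerivative(double) and the Double.NaN convention come from the proposal; the eval(double) method name and the Sigmoid and Step classes are assumptions made up for illustration, not libai's actual code.

```java
// Hypothetical sketch of the proposed interface; eval(double) is an assumed name.
interface Function {
    double eval(double x);

    // Value of the first derivative at x, or Double.NaN when it is not defined.
    double evalDerivative(double x);
}

// Logistic sigmoid: differentiable everywhere.
class Sigmoid implements Function {
    @Override
    public double eval(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    @Override
    public double evalDerivative(double x) {
        double y = eval(x);
        return y * (1.0 - y);
    }
}

// Illustrative step function standing in for something like Sign;
// libai's actual definition may differ.
class Step implements Function {
    @Override
    public double eval(double x) {
        return x < 0 ? 0.0 : 1.0;
    }

    @Override
    public double evalDerivative(double x) {
        return Double.NaN; // no usable derivative
    }
}
```

One thing to keep in mind with this design: NaN propagates silently through arithmetic, so a caller such as MLP would need an explicit Double.isNaN check to detect a function that cannot be used for training.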

Function (Alternative)

Functions are used in other places (like Matrix), and they are also useful for implementing things like custom random number generators (at least that's how I've used them). So why not create two interfaces, Function and a FunctionDerivative that extends it?

That way we don't lose any functionality, all of a function's code will still be in one class, and we clarify the use of Function in cases like Matrix#apply(Function) where a derivative makes no sense.

I think I like this alternative much more than the original one...
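A possible shape for the two-interface alternative is sketched below. The name FunctionDerivative and the fact that it extends Function come from this thread; the method names eval(double) and getDerivative() on the split interfaces, and the Sigmoid class, are assumptions for illustration only.

```java
// Plain function: enough for uses where a derivative makes no sense.
interface Function {
    double eval(double x); // assumed method name
}

// A function that can also hand out its derivative (assumed signature).
interface FunctionDerivative extends Function {
    Function getDerivative();
}

// All of the function's code stays in one class.
class Sigmoid implements FunctionDerivative {
    @Override
    public double eval(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    @Override
    public Function getDerivative() {
        // Function has a single abstract method, so a lambda is enough here.
        return x -> {
            double y = eval(x);
            return y * (1.0 - y);
        };
    }
}
```

With this split, code that only maps values can accept a Function, while code that needs gradients can require a FunctionDerivative and have the compiler enforce it.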

Adding other activation functions

Gaussian and Sinc work great as activation functions for MLP, particularly when training on data with a finite number of discontinuities.

So, besides Gaussian and Sinc, I was planning on adding most of the common activation functions.
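For reference, a minimal sketch of Gaussian and Sinc with their first derivatives, written against the current getDerivative() style. The eval(double) name, the unnormalized forms exp(-x^2) and sin(x)/x, and returning null for the second derivative are assumptions of this sketch, not necessarily what would end up in libai.

```java
// Assumed shape of the current interface: a value plus a derivative function.
interface Function {
    double eval(double x);

    Function getDerivative();
}

// Gaussian: f(x) = exp(-x^2), f'(x) = -2x * exp(-x^2).
class Gaussian implements Function {
    @Override
    public double eval(double x) {
        return Math.exp(-x * x);
    }

    @Override
    public Function getDerivative() {
        return new Function() {
            @Override
            public double eval(double x) {
                return -2.0 * x * Math.exp(-x * x);
            }

            @Override
            public Function getDerivative() {
                return null; // second derivative not provided in this sketch
            }
        };
    }
}

// Sinc: f(x) = sin(x) / x with f(0) = 1, f'(x) = (x cos x - sin x) / x^2 with f'(0) = 0.
class Sinc implements Function {
    @Override
    public double eval(double x) {
        return x == 0.0 ? 1.0 : Math.sin(x) / x;
    }

    @Override
    public Function getDerivative() {
        return new Function() {
            @Override
            public double eval(double x) {
                // Near zero the closed form cancels badly, so use the series -x/3 + x^3/30.
                if (Math.abs(x) < 1e-4) {
                    return -x / 3.0 + x * x * x / 30.0;
                }
                return (x * Math.cos(x) - Math.sin(x)) / (x * x);
            }

            @Override
            public Function getDerivative() {
                return null; // second derivative not provided in this sketch
            }
        };
    }
}
```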

Adding random initializers

This one is more of a personal matter. I've been implementing my random matrix initializers using Matrix#apply(Function) (that's actually the main reason for this proposal). It would be nice to have them inside libai, but I can live without it.
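As a rough illustration of the idea, the sketch below implements a uniform initializer as a Function that ignores its input and returns a fresh random value. ArrayOps.apply is a hypothetical stand-in operating on a plain array, since the exact behavior of Matrix#apply(Function) is not shown in this thread; the eval(double) name and the UniformInitializer class are also assumptions.

```java
// Assumed minimal function interface for this sketch.
interface Function {
    double eval(double x);
}

// Ignores its input and returns a uniform random value in [min, max).
class UniformInitializer implements Function {
    private final java.util.Random rng;
    private final double min;
    private final double max;

    UniformInitializer(double min, double max, long seed) {
        this.min = min;
        this.max = max;
        this.rng = new java.util.Random(seed); // seeded for reproducible runs
    }

    @Override
    public double eval(double x) {
        return min + (max - min) * rng.nextDouble();
    }
}

// Hypothetical stand-in for Matrix#apply(Function): maps f over every element in place.
class ArrayOps {
    static void apply(double[] values, Function f) {
        for (int i = 0; i < values.length; i++) {
            values[i] = f.eval(values[i]);
        }
    }
}
```

Usage would be something like ArrayOps.apply(weights, new UniformInitializer(-0.5, 0.5, 42L)), which overwrites every entry of weights with a value in [-0.5, 0.5).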

Soooooooooo, what do you think? (yeah, I really enjoy writing long posts hehe)

kronenthaler commented 7 years ago

From the mathematical point of view, FunctionDerivative makes no sense, since the derivative of a function is a function as well. The rationale for returning a function from getDerivative() is to be able to use it as a Function anywhere else. It provides more consistent typing across the application.

It's true that we don't (currently) use derivatives higher than first order, but that doesn't mean we won't in the future. Also, it's mathematically correct that a function without a first derivative returns null, because that derivative doesn't exist.

So, to sum up, I'd rather not change that, as the current approach is the most mathematically correct and consistent one. However, I'm fine with creating new functions to generate random numbers and things like that. In such cases, throwing an UnsupportedOperationException from getDerivative() or returning null should be enough.
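A minimal sketch of that suggestion, assuming the interface consists of eval(double) and getDerivative(). The GaussianNoise class and its constructor parameters are made up for illustration.

```java
// Assumed shape of the current interface.
interface Function {
    double eval(double x);

    Function getDerivative();
}

// Random number generator expressed as a Function; it has no meaningful derivative.
class GaussianNoise implements Function {
    private final java.util.Random rng;
    private final double mean;
    private final double stdDev;

    GaussianNoise(double mean, double stdDev, long seed) {
        this.mean = mean;
        this.stdDev = stdDev;
        this.rng = new java.util.Random(seed);
    }

    @Override
    public double eval(double x) {
        // The input is ignored; each call draws a new sample.
        return mean + stdDev * rng.nextGaussian();
    }

    @Override
    public Function getDerivative() {
        throw new UnsupportedOperationException("GaussianNoise has no derivative");
    }
}
```

Throwing fails loudly at the point of misuse, whereas returning null would defer the failure to whoever first dereferences the result.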

dktcoding commented 7 years ago

Actually, if FunctionDerivative extends Function, it will still be a Function in itself, but I get your point.

I'll leave them as they are and send a pull request for the "extra" functions and random generators once I clean them up.