
Addition of the Loss derived type and of the MSE loss function #175

Closed · jvdp1 closed 4 months ago

jvdp1 commented 4 months ago

As discussed with @milancurcic in #173:

TODO:
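For context, a minimal sketch of what an MSE loss function could look like in Fortran. The module and argument names below are illustrative, not necessarily the ones introduced in this PR:

```fortran
module nf_loss_sketch
  implicit none
  private
  public :: mse
contains
  pure function mse(true, predicted) result(res)
    ! Mean squared error: the mean of the squared differences
    ! between the target and predicted values.
    real, intent(in) :: true(:), predicted(:)
    real :: res
    res = sum((true - predicted)**2) / size(true)
  end function mse
end module nf_loss_sketch
```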

jvdp1 commented 4 months ago

@milancurcic what is the strategy for the tests?

milancurcic commented 4 months ago

Thanks, @jvdp1, I'll start a test program.

milancurcic commented 4 months ago

@jvdp1 I put in a few very minimal tests that check the expected values given simple inputs. Feel free to add if you can think of better tests. I've also been thinking about how we can test the integration of these loss functions with the network; perhaps also using simple inputs and known outputs, but passing them through the network type.
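As an illustration of the "expected values given simple inputs" approach, here is a sketch of such a check, reusing the illustrative nf_loss_sketch module from above (this is not the actual test program added in this PR):

```fortran
program test_mse_sketch
  use nf_loss_sketch, only: mse
  implicit none
  real, parameter :: tol = 1e-6
  ! For true = [1, 2, 3] and predicted = [1, 2, 4],
  ! MSE = (0**2 + 0**2 + 1**2) / 3 = 1/3.
  if (abs(mse([1., 2., 3.], [1., 2., 4.]) - 1. / 3.) > tol) then
    print *, 'mse check: FAILED'
    stop 1
  end if
  print *, 'mse check: PASSED'
end program test_mse_sketch
```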

jvdp1 commented 4 months ago

> Feel free to add if you can think of better tests.

Thank you. These tests LGTM.

> perhaps also using simple inputs and known outputs, but passing them through the network type.

It could be a possibility. But I guess this would test their support in the implementation more than the loss functions themselves. If so, would such tests be more appropriate in, e.g., test_dense_network.f90?

milancurcic commented 4 months ago

On second thought, let's wait on testing the integration with the network (regardless of where those tests would be defined). As we implemented general mechanisms to specify and use losses and optimizers, it has become apparent to me how important it is to separate model creation (i.e. via the network_from_layers constructor) from "compilation", as is done in more mature Python frameworks. In Keras, for example, you first create the model by specifying the architecture, and then in a separate step "compile" it by passing the loss function, the optimizer, and the evaluation metrics to use; this allows, for example, reusing the same network instance with different optimizers and losses.
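For readers unfamiliar with the Keras pattern, a hypothetical sketch of what this separation could look like in Fortran; the compile method and the mse loss constructor used here are illustrative assumptions, not part of the current neural-fortran API:

```fortran
program compile_sketch
  ! Hypothetical two-step workflow; no such compile method exists (yet).
  use nf, only: dense, input, mse, network, sgd
  implicit none
  type(network) :: net

  ! Step 1: create the model by specifying only the architecture.
  net = network([input(3), dense(5), dense(1)])

  ! Step 2: "compile" it in a separate step with a loss and an optimizer,
  ! so the same instance could be reused with different configurations.
  call net % compile(optimizer=sgd(learning_rate=0.01), loss=mse())
end program compile_sketch
```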

I'll merge this and open a separate issue. Thank you for the PR!