Implementing basic RNN - Githubissues

modern-fortran / neural-fortran

A parallel framework for deep learning

MIT License

Implementing basic RNN #162

Open castelao opened 1 year ago

castelao commented 1 year ago

A work in progress. I'm mostly interested in loading a TF model from an HDF5 file and applying predict(), but I'll do my best to make this a complete implementation that is consistent with the rest of the library.

milancurcic commented 1 year ago

Amazing, thanks and great to see you, @castelao, after so many years. 🙂

Will review this coming week.

castelao commented 1 year ago

@milancurcic, yes, it's great to connect again. Thank you and the other developers for the time you have put into this library! It is great.

I'm not fluent in modern Fortran, so if you see anything that doesn't make sense, please let me know. And be aware, it is still a WIP.

milancurcic commented 11 months ago

I added support for rnn_layer to network % predict so that we can run the simple_rnn example. However, I can't get the simple example to converge, and I'm not very familiar with recurrent networks. @castelao, is there a small toy example, like the one in simple.f90, that's known to converge quickly? We could then use it for testing as well. With a working example and a small test suite (I can add this), this PR is almost good to go.
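For readers who, like the comment above, are less familiar with recurrent networks, the forward pass of a simple (Elman) recurrent layer can be sketched in a few lines. This is only an illustrative pure-Python sketch, not the rnn_layer code from this PR. The function names `rnn_step` and `rnn_forward`, the tanh activation, and the zero initial state are assumptions made for the example.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: h = tanh(W_x x + W_h h_prev + b).

    x      : input vector (list of floats)
    h_prev : previous hidden state (list of floats)
    w_x    : input weights, shape [hidden][input]
    w_h    : recurrent weights, shape [hidden][hidden]
    b      : bias vector, length hidden
    """
    h = []
    for i in range(len(b)):
        z = b[i]
        z += sum(w_x[i][j] * x[j] for j in range(len(x)))
        z += sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(z))
    return h


def rnn_forward(xs, w_x, w_h, b):
    """Run the layer over a sequence, starting from a zero hidden state.

    Returns the list of hidden states, one per time step.
    """
    h = [0.0] * len(b)
    states = []
    for x in xs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


if __name__ == "__main__":
    # A single hidden unit with hypothetical weights, fed a two-step sequence.
    states = rnn_forward([[1.0], [0.0]], w_x=[[1.0]], w_h=[[1.0]], b=[0.0])
    print(states)
```

The second state depends on the first through `w_h`, which is the only structural difference from a dense layer; a converging toy example would add a loss and backpropagation through time on top of this.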

castelao commented 5 months ago

@milancurcic, I need to check how the library has changed since I last worked on this and whether what I did is still consistent with it. If it looks fine, I intend to work on the following:

- The toy example.
- Loading a model from an HDF5 file exported by TensorFlow. I want to avoid recompiling every time I change my coefficients.

Are there any other requirements before I can submit this PR for review? It will be slow progress, but I'm back to this.

milancurcic commented 5 months ago

@castelao Thanks for all the additions. Sounds good.

It may be helpful to merge main into here, which would reflect the few recent changes in other layers that @jvdp1 mentioned.
Regarding loading recurrent layers from Keras HDF5 files: unless you need it, I'd say it's not a priority. If you do need it, let's add it in a later PR, as considerable additions would be needed to make it work (e.g. a Keras script example in neural-fortran/keras-snippets). That format is also no longer the latest Keras saved-model format (they change too often!). Related but not part of this PR, I'd like to find an easy way to make this part (and the HDF5 dependency) optional at build time. Probably via preprocessor flags in the code, but I'd need to read up on how to do this in fpm.toml.
But a standalone example program in examples/ is important.

castelao commented 4 months ago

@jvdp1, thanks for your suggestions. I'll work on that.

@milancurcic, yes, I have already rebased it onto main. I'm interested in loading from Keras output, but I see your point. I'll leave that loading capability for another PR and work on the example for this one. Thanks!