Open tyralla opened 1 year ago
We should also think about the requirements:
@HenningOp and I discussed the possibility of wrapping the LSTM implementation of Tensorflow/Keras into HydPy. Tensorflow seems to be the current standard tool for the given purpose, and Keras provides a more convenient API for it.
Tensorflow is available on PyPI, which is good news, as we only support installing HydPy via pip and not via conda.
HydPy should always work under all active Python versions, currently 3.8, 3.9, 3.10, and 3.11. The latest Tensorflow version (2.13.0) is available for exactly these Python versions. I am not aware of any general promises regarding the supported Python versions, but I do not expect Tensorflow to start dropping active Python versions soon.
PyPI provides "win", "macosx", and "manylinux" Tensorflow wheels, so we could still use HydPy under Linux and Windows after wrapping Tensorflow.
Tensorflow uses the Apache License 2.0, whereas HydPy uses the GNU Lesser General Public License 3.0. The Apache License is the more permissive of the two, so it should allow us to bundle Tensorflow with our HydPy installer. However, someone should check this more carefully.
johnnydep told me that tensorflow-intel is Tensorflow's only dependency. However, when installing Tensorflow in my virtual environment for developing HydPy, pip also installed or changed the following packages:
For installing tensorflow-intel, pip needed to download 276.6 MB. So, adding Tensorflow to the HydPy installer (if it works at all) would vastly increase its size (currently about 100 MB).
Installing Tensorflow downgraded numpy from version 1.25.0 to 1.24.3, as its requirements.txt files specify dependency versions via "==". I would not be surprised if Tensorflow actually requires particular numpy versions due to its high level of performance optimisation. So this could mean trouble when working with other libraries that rely on different specific numpy versions (possibly the case for arcpy, which seems to expect 1.20.1?).
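To make such version conflicts visible before they bite, one could inspect the requirements that an installed distribution declares. The following is only a minimal sketch using the standard library's `importlib.metadata`; the helper names `filter_requirements` and `declared_constraints` are hypothetical and not part of HydPy or Tensorflow.

```python
import re
from importlib import metadata
from typing import Iterable, List


def filter_requirements(requirements: Iterable[str], package: str) -> List[str]:
    """Return the requirement strings whose project name equals `package`."""
    selected = []
    for requirement in requirements:
        # The project name is the leading run of name characters, e.g.
        # "numpy" in "numpy==1.24.3" or in "numpy (<1.25)".
        match = re.match(r"[A-Za-z0-9._-]+", requirement.strip())
        if match and match.group(0).lower() == package.lower():
            selected.append(requirement)
    return selected


def declared_constraints(distribution: str, package: str = "numpy") -> List[str]:
    """List what an installed distribution declares about `package`.

    Returns an empty list if the distribution is not installed or
    declares no requirements at all.
    """
    try:
        requirements = metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []
    return filter_requirements(requirements, package)
```

Calling `declared_constraints("tensorflow-intel")` and comparing the result with the same call for other installed libraries would show whether their numpy requirements can be satisfied at the same time. The result depends on the local environment, so no output is shown here.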
We could not think of serious programming difficulties in wrapping a Tensorflow-LSTM in a HydPy model. My gut feeling is that this would require similar work to implementing the LSTM equations from scratch. The main advantage of the wrapping approach would be that we could easily add further convenience functions to the HydPy model that rely on other Tensorflow functionalities, most notably the training algorithms. On the downside, the huge increase in dependencies (with specific version requirements) might result in a long-term rise in maintenance costs.
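For a rough feeling of what "from scratch" would involve, here is a minimal sketch of a single forward step of the commonly used LSTM formulation (input, forget, and output gates plus a candidate cell state). It is neither HydPy code nor Tensorflow's implementation; all names (`lstm_step`, `params`, the gate keys) are assumptions made for illustration, and a real implementation would rather use numpy arrays or Cython than plain lists.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (W, U, b) of one gate


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def affine(
    w: Matrix, u: Matrix, b: Vector, x: Sequence[float], h: Sequence[float]
) -> Vector:
    """Return W @ x + U @ h + b, one value per hidden unit."""
    return [
        sum(wi * xi for wi, xi in zip(w_row, x))
        + sum(ui * hi for ui, hi in zip(u_row, h))
        + bi
        for w_row, u_row, bi in zip(w, u, b)
    ]


def lstm_step(
    x: Sequence[float],
    h_prev: Sequence[float],
    c_prev: Sequence[float],
    params: Dict[str, GateParams],
) -> Tuple[Vector, Vector]:
    """Advance the LSTM by one time step and return (h, c).

    `params` maps the gate names "i" (input), "f" (forget), "o" (output),
    and "g" (candidate cell state) to their (W, U, b) triples.
    """
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]
    g = [math.tanh(v) for v in affine(*params["g"], x, h_prev)]
    # New cell state: keep part of the old state and add gated new input.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

The forward pass itself is short. The larger effort of the from-scratch route would lie in everything around it, such as training, which is where the wrapping approach promises the most benefit.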
We thought about implementing long short-term memory (LSTM) in HydPy, which currently seems to be the most promising artificial neural network type for some hydrological applications (e.g. flood forecasting). I open this issue as a notebook for the following deeper discussions. The key questions are: