nmaarnio opened 9 months ago
@msmiyels, could you also help by reviewing this at some point? Especially since I've now touched and modified code by @rajuranpe, I don't want to merge this tool before we are sure it's correct and aligned with the toolkit.
Ahoi,
@nmaarnio, I'll start the EIS review and CLI stuff this week. I can also have a look into the autoencoder; however, I do not have detailed experience with it. I would also set this to the lowest priority compared with the other open tasks, if that's okay.
Something to consider if you are venturing into neural network space:
If this is aimed at QGIS users as a general demographic, that means a lot of Windows use.
A generic requirements install of TensorFlow will work, but it will get you CPU-only TensorFlow unless you specifically design it otherwise.
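A quick way to check what a given install actually got (`tf.config.list_physical_devices` is standard TensorFlow 2.x; the interpretation comment is my assumption about a typical Windows pip install):

```python
import tensorflow as tf

# On a plain `pip install tensorflow` on Windows this typically
# prints an empty list for GPU, i.e. CPU-only execution.
print(tf.config.list_physical_devices("GPU"))
print(tf.config.list_physical_devices("CPU"))
```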
@RichardScottOZ Ya, we know that. The toolkit/plugin is designed so that everyone can use (and install) it with ease (more or less 😉). Things like GPU support 🚀 were not considered 🛑 since they make the whole thing way more complicated, especially the installation.
You may know that the GPU stuff is strongly dependent on the GPU, system/architecture, drivers and software versioning. The NN-related functions should work quite well on CPU only, too. I'm not 100% sure about that for autoencoders, but those are, at least currently, not the typical method used in MPM, and if added to the toolkit, more an "experimental" kind of thing.
Regarding the "common" MPM-related, NN-based methods like ANN, we do not expect a major disadvantage from not using the GPU. Of course, it will be slower, but still reasonable in terms of computing times for this purpose.
On the use of autoencoders, here's a pretty typical example I used for something a few years ago:
https://github.com/RichardScottOZ/hyperspectral-autoencoders
adapted to look at minerals of course, not bricks and asphalt.
@nmaarnio
Sorry, I didn't notice this.
> The regular autoencoder had `input_shape` parameter and the U-net `resolution`. Was this on purpose or could both use the same parameter type?
These two are the same thing; I just seem to have made them a bit different for the U-net and non-U-net.
> The default value for `batch_size` is quite different from what we have for MLP. Just checking if 128 is a good default value for this one?
As these autoencoders are often trained on images (usually clipped from larger images), I think a batch size of 32 or 64 is better, to make sure it can be run on less powerful computers.
> For MLP we parameterized the activation function, last activation function, optimizer and learning rate for the optimizer. Do we always use the choices that are currently defined in code for this autoencoder, or should these be parameterized?
These should be parameterized, yes. Maybe use the currently defined ones as the default values.
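As a minimal sketch of what that could look like (the function name and defaults here are illustrative, not the actual toolkit API; only standard Keras calls are used):

```python
from tensorflow import keras

def compile_autoencoder(
    model: keras.Model,
    optimizer: str = "adam",        # hypothetical defaults, mirroring the idea
    learning_rate: float = 0.001,   # of keeping the current choices as defaults
    loss: str = "mse",
) -> keras.Model:
    """Hypothetical helper exposing the optimizer and its learning rate
    as parameters, like the MLP implementation does."""
    opt = keras.optimizers.get(optimizer)  # resolve the optimizer by name
    opt.learning_rate = learning_rate      # override its learning rate
    model.compile(optimizer=opt, loss=loss)
    return model

# Activation and last activation would be passed analogously at model
# build time, e.g. layers.Conv2D(..., activation=activation).
```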
> Is `regularization` important for these autoencoders? I remember that we decided to leave them out for the MLP implementation for some reason.
Usually regularization is beneficial to prevent overfitting. Here it's done with L2 and dropout, both of which can be controlled by the user. L2 regularization penalizes the squared values of the weights, helping to keep them small, while dropout randomly sets a fraction of input units to zero during training, which helps make the model robust to noise and variations in the input data. I think the results generally do improve when using these two, as they add noise to the data, forcing the model to work with limited resources and to generalize beyond just the training data. L2 regularization could also be parameterized so it can be replaced with L1 regularization. The values that dropout and L2 take were already parameterized, but should have default values.
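A minimal Keras sketch of those two mechanisms (the layer size, regularization factor and dropout rate are illustrative values, not the toolkit's):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

inputs = keras.Input(shape=(16,))
# L2 penalizes the squared weight values, keeping them small; swapping
# regularizers.l2 for regularizers.l1 would parameterize the choice.
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(inputs)
# Dropout zeroes a random 20% of activations during training only,
# adding noise that pushes the model to generalize.
x = layers.Dropout(0.2)(x)
```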
> Could the `modality` parameter be inferred from the input data or from another parameter? If not, should it be tested somehow that the given number of modalities is correct?
The modality is the number of bands, i.e. one can get it from the input data (the size of the third axis in images, `image.shape[2]`). As an example, with RGB images of shape [512, 512, 3] you would get 3; here we use bands instead.
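In code, that inference is a one-liner (sketch):

```python
import numpy as np

image = np.zeros((512, 512, 3))  # RGB example; for rasters, the bands axis
modality = image.shape[2]        # -> 3, so the parameter could be inferred
assert modality >= 1             # simple sanity check if it stays user-given
```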
> Do you think the names of the functions are OK? I am trying to recall a discussion where we talked about a "segmentation autoencoder" or something similar, but I might be wrong. Related to this, could you briefly describe for what tasks and when these autoencoders are potentially useful?
The names seem OK.
Autoencoders are unsupervised learning, so they are quite different from models like MLP and more like clustering: during training an autoencoder does not receive ground truth labels, only the raw data. A trained model can then be used with other models or to inspect the bottleneck-layer features, as autoencoders reduce the number of dimensions in the data to represent it in the smallest possible number of features. This effectively captures the most important features of the data, helping with dimensionality reduction, either to inspect underlying patterns or to feed other models, like MLPs, CNNs or segmentation models, with broader features. In short, they take the input data, compress it into a small number of features/parameters and then reconstruct the data from those compressed features. This can also be used for anomaly detection, since reconstruction will perform poorly on data that differs greatly from the general, "normal" data.
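To make the compress/reconstruct idea concrete, here is a minimal convolutional autoencoder sketch (illustrative shapes and filter counts, not the toolkit implementation):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: compress 64x64x3 images down to a small bottleneck.
inp = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2)(x)                                  # 32x32
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
bottleneck = layers.MaxPooling2D(2)(x)                         # 16x16x8 features

# Decoder: reconstruct the input from the bottleneck features.
x = layers.Conv2DTranspose(8, 3, strides=2, activation="relu",
                           padding="same")(bottleneck)
x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu",
                           padding="same")(x)
out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
# Unsupervised: the input is its own training target.
# autoencoder.fit(images, images, epochs=10, batch_size=32)
# High per-sample reconstruction error can then flag anomalies.
```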
> I am having a little bit of difficulty figuring out if this autoencoder wants somehow different input data than MLP. Could you check how we prepare and take data in the current MLP implementation and in the `machine_learning_general.py` module, and see if the same data preparation would suffice for the autoencoder?
It is true that the autoencoder wants a different kind of data than MLP, since an autoencoder is unsupervised learning, i.e. it does not receive ground truth labels but only the raw data. It works like clustering, and a trained model can then be used with other models or to inspect the bottleneck-layer features. Therefore it can be fed just the data. This is a convolutional autoencoder, so it wants images, and they can be either cut into the correct sizes or resized to the shape the autoencoder takes as input (an input shape like 1024 can be quite demanding). The input and the parameterized model shapes should be powers of two, especially since a U-net is hard to get working when the layer sizes are not even numbers, and one needs to use padding etc.
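For the power-of-two constraint, a hedged sketch of padding a tile up to the next power of two before feeding it to a U-net-style model (the helper name is hypothetical):

```python
import numpy as np

def pad_to_power_of_two(image: np.ndarray) -> np.ndarray:
    """Pad H and W up to the next power of two so repeated 2x pooling
    in a U-net-style autoencoder always yields even sizes (sketch)."""
    h, w = image.shape[:2]
    target_h = 1 << (h - 1).bit_length()   # next power of two >= h
    target_w = 1 << (w - 1).bit_length()
    return np.pad(image,
                  ((0, target_h - h), (0, target_w - w), (0, 0)),
                  mode="reflect")

tile = np.random.rand(300, 300, 5)          # e.g. a 5-band raster tile
print(pad_to_power_of_two(tile).shape)      # (512, 512, 5)
```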
Thanks for the comments and guidance @rajuranpe! No problem with the delay, it happens.
@rajuranpe, could you help us by reviewing the new autoencoder implementation? It should do what your original implementation was doing, I just tried to refactor it to fit the toolkit better. You can find the new implementation in `autoencoder_new.py`. I would also have some questions about the parameters. Some of these might be stupid questions, sorry.

- The regular autoencoder had `input_shape` parameter and the U-net `resolution`. Was this on purpose or could both use the same parameter type?
- The default value for `batch_size` is quite different from what we have for MLP. Just checking if 128 is a good default value for this one?
- For MLP we parameterized the activation function, last activation function, optimizer and learning rate for the optimizer. Do we always use the choices that are currently defined in code for this autoencoder, or should these be parameterized?
- Is `regularization` important for these autoencoders? I remember that we decided to leave them out for the MLP implementation for some reason.
- Could the `modality` parameter be inferred from the input data or from another parameter? If not, should it be tested somehow that the given number of modalities is correct?
- Do you think the names of the functions are OK? I am trying to recall a discussion where we talked about a "segmentation autoencoder" or something similar, but I might be wrong. Related to this, could you briefly describe for what tasks and when these autoencoders are potentially useful?
- I am having a little bit of difficulty figuring out if this autoencoder wants somehow different input data than MLP. Could you check how we prepare and take data in the current MLP implementation and in the `machine_learning_general.py` module, and see if the same data preparation would suffice for the autoencoder?

I have included some "additional" stuff in `autoencoder_utils.py` right now, but the plan is to not include it when merging this. However, if some of it is seen as useful, it could be integrated into `machine_learning_general.py`. When the implementation is about ready, I will add some unit tests.