nmaarnio opened 9 months ago
@msmiyels, could you also help by reviewing this at some point? Especially since I've now touched and modified code by @rajuranpe, I don't want to merge this tool before we are sure it's correct and aligned with the toolkit.
Ahoi,
@nmaarnio, I'll start the EIS review and CLI stuff this week. I can also have a look into the autoencoder; however, I do not have detailed experience with it. I would also set this to the lowest priority compared with the other open tasks, if that's okay.
Something to consider if you are venturing into neural network space:
If this is aimed at QGIS users as a general demographic, that means a lot of Windows use.
A generic requirements install of TensorFlow will work, but it will get you CPU-only TensorFlow unless you specifically design it otherwise.
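A quick way to check what a given install actually got (`tf.config.list_physical_devices` is standard TensorFlow 2.x; the interpretation comment is my assumption about a typical Windows pip install):

```python
import tensorflow as tf

# On a plain `pip install tensorflow` on Windows this typically
# prints an empty list for GPU, i.e. CPU-only execution.
print(tf.config.list_physical_devices("GPU"))
print(tf.config.list_physical_devices("CPU"))
```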
@RichardScottOZ Ya, we know that. The toolkit/plugin is designed so that everyone can use (and install) it with ease (more or less 😉). Things like GPU support 🚀 were not considered 🛑 since they make the whole thing way more complicated, especially the installation.
You may know that the GPU stuff is strongly dependent on the GPU, system/architecture, drivers and software versioning. The NN-related functions should work quite well on CPU only, too. I'm not 100% sure about that for autoencoders, but those are, at least currently, not the typical method used in MPM, and if added to the toolkit, more an "experimental" kind of thing.
Regarding the "common" MPM-related, NN-based methods like ANN, we do not expect a major disadvantage from not using the GPU. Of course, it will be slower, but still reasonable in terms of computing times for this purpose.
On the use of autoencoders, here's a pretty typical example I used for something a few years ago:
https://github.com/RichardScottOZ/hyperspectral-autoencoders
adapted to look at minerals of course, not bricks and asphalt.
@nmaarnio
Sorry, I didn't notice this.
> The regular autoencoder had `input_shape` parameter and the U-net `resolution`. Was this on purpose or could both use the same parameter type?
These two are the same thing; I just seem to have made them a bit different for the U-net and non-U-net.
> The default value for `batch_size` is quite different from what we have for MLP. Just checking if 128 is a good default value for this one?
As these autoencoders are often trained on images (usually clipped from larger images), I think a batch size of 32 or 64 is better, to make sure it can be run on less powerful computers.
> For MLP we parameterized the activation function, last activation function, optimizer and learning rate for the optimizer. Do we always use the choices that are currently defined in code for this autoencoder, or should these be parameterized?
These should be parameterized, yes. Maybe use the currently defined ones as the default values.
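As a minimal sketch of what that could look like (the function name and defaults here are illustrative, not the actual toolkit API; only standard Keras calls are used):

```python
from tensorflow import keras

def compile_autoencoder(
    model: keras.Model,
    optimizer: str = "adam",        # hypothetical defaults, mirroring the idea
    learning_rate: float = 0.001,   # of keeping the current choices as defaults
    loss: str = "mse",
) -> keras.Model:
    """Hypothetical helper exposing the optimizer and its learning rate
    as parameters, like the MLP implementation does."""
    opt = keras.optimizers.get(optimizer)  # resolve the optimizer by name
    opt.learning_rate = learning_rate      # override its learning rate
    model.compile(optimizer=opt, loss=loss)
    return model

# Activation and last activation would be passed analogously at model
# build time, e.g. layers.Conv2D(..., activation=activation).
```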
> Is `regularization` important for these autoencoders? I remember that we decided to leave them out for the MLP implementation for some reason.
Usually regularization is beneficial to prevent overfitting. Here it's done with L2 and dropout, both of which can be controlled by the user. L2 regularization penalizes the squared values of the weights, helping to keep them small, while dropout randomly sets a fraction of input units to zero during training, which helps make the model robust to noise and variations in the input data. I think the results generally do improve when using these two, as they add noise to the data, forcing the model to work with limited resources and to generalize beyond just the training data. L2 regularization could also be parameterized so it can be replaced with L1 regularization. The values that dropout and L2 take were already parameterized, but should have default values.
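A minimal Keras sketch of those two mechanisms (the layer size, regularization factor and dropout rate are illustrative values, not the toolkit's):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

inputs = keras.Input(shape=(16,))
# L2 penalizes the squared weight values, keeping them small; swapping
# regularizers.l2 for regularizers.l1 would parameterize the choice.
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(inputs)
# Dropout zeroes a random 20% of activations during training only,
# adding noise that pushes the model to generalize.
x = layers.Dropout(0.2)(x)
```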
> Could the `modality` parameter be inferred from the input data or from another parameter? If not, should it be tested somehow that the given number of modalities is correct?
The modality is the number of bands, i.e. one can get it from the input data (the size of the third axis in images, `image.shape[2]`). As an example, with RGB images of shape [512, 512, 3] you would get 3; here we use bands instead.
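In code, that inference is a one-liner (sketch):

```python
import numpy as np

image = np.zeros((512, 512, 3))  # RGB example; for rasters, the bands axis
modality = image.shape[2]        # -> 3, so the parameter could be inferred
assert modality >= 1             # simple sanity check if it stays user-given
```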
> Do you think the names of the functions are OK? I am trying to recall a discussion where we talked about a "segmentation autoencoder" or something similar, but I might be wrong. Related to this, could you briefly describe for what tasks and when these autoencoders are potentially useful?
The names seem OK.
Autoencoders are unsupervised learning, so they are quite different from models like MLP and more like clustering: during training an autoencoder does not receive ground truth labels, only the raw data. A trained model can then be used with other models or to inspect the bottleneck-layer features, as autoencoders reduce the number of dimensions in the data to represent it in the smallest possible number of features. This effectively captures the most important features of the data, helping with dimensionality reduction, either to inspect underlying patterns or to feed other models, like MLPs, CNNs or segmentation models, with broader features. In short, they take the input data, compress it into a small number of features/parameters and then reconstruct the data from those compressed features. This can also be used for anomaly detection, since reconstruction will perform poorly on data that differs greatly from the general, "normal" data.
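To make the compress/reconstruct idea concrete, here is a minimal convolutional autoencoder sketch (illustrative shapes and filter counts, not the toolkit implementation):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: compress 64x64x3 images down to a small bottleneck.
inp = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2)(x)                                  # 32x32
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
bottleneck = layers.MaxPooling2D(2)(x)                         # 16x16x8 features

# Decoder: reconstruct the input from the bottleneck features.
x = layers.Conv2DTranspose(8, 3, strides=2, activation="relu",
                           padding="same")(bottleneck)
x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu",
                           padding="same")(x)
out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
# Unsupervised: the input is its own training target.
# autoencoder.fit(images, images, epochs=10, batch_size=32)
# High per-sample reconstruction error can then flag anomalies.
```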
> I am having a little bit of difficulty figuring out if this autoencoder wants somehow different input data than MLP. Could you check how we prepare and take data in the current MLP implementation and in the `machine_learning_general.py` module, and see if the same data preparation would suffice for the autoencoder?
It is true that the autoencoder wants a different kind of data than MLP, since an autoencoder is unsupervised learning, i.e. it does not receive ground truth labels but only the raw data. It works like clustering, and a trained model can then be used with other models or to inspect the bottleneck-layer features. Therefore it can be fed just the data. This is a convolutional autoencoder, so it wants images, and they can be either cut into the correct sizes or resized to the shape the autoencoder takes as input (an input shape like 1024 can be quite demanding). The input and the parameterized model shapes should be powers of two, especially since a U-net is hard to get working when the layer sizes are not even numbers, and one needs to use padding etc.
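For the power-of-two constraint, a hedged sketch of padding a tile up to the next power of two before feeding it to a U-net-style model (the helper name is hypothetical):

```python
import numpy as np

def pad_to_power_of_two(image: np.ndarray) -> np.ndarray:
    """Pad H and W up to the next power of two so repeated 2x pooling
    in a U-net-style autoencoder always yields even sizes (sketch)."""
    h, w = image.shape[:2]
    target_h = 1 << (h - 1).bit_length()   # next power of two >= h
    target_w = 1 << (w - 1).bit_length()
    return np.pad(image,
                  ((0, target_h - h), (0, target_w - w), (0, 0)),
                  mode="reflect")

tile = np.random.rand(300, 300, 5)          # e.g. a 5-band raster tile
print(pad_to_power_of_two(tile).shape)      # (512, 512, 5)
```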
Thanks for the comments and guidance @rajuranpe! No problem with the delay, it happens.
@rajuranpe, could you help us by reviewing the new autoencoder implementation? It should do what your original implementation was doing, I just tried to refactor it to fit the toolkit better. You can find the new implementation in `autoencoder_new.py`. I would also have some questions about the parameters. Some of these might be stupid questions, sorry.

- The regular autoencoder had `input_shape` parameter and the U-net `resolution`. Was this on purpose or could both use the same parameter type?
- The default value for `batch_size` is quite different from what we have for MLP. Just checking if 128 is a good default value for this one?
- For MLP we parameterized the activation function, last activation function, optimizer and learning rate for the optimizer. Do we always use the choices that are currently defined in code for this autoencoder, or should these be parameterized?
- Is `regularization` important for these autoencoders? I remember that we decided to leave them out for the MLP implementation for some reason.
- Could the `modality` parameter be inferred from the input data or from another parameter? If not, should it be tested somehow that the given number of modalities is correct?
- Do you think the names of the functions are OK? I am trying to recall a discussion where we talked about a "segmentation autoencoder" or something similar, but I might be wrong. Related to this, could you briefly describe for what tasks and when these autoencoders are potentially useful?
- I am having a little bit of difficulty figuring out if this autoencoder wants somehow different input data than MLP. Could you check how we prepare and take data in the current MLP implementation and in the `machine_learning_general.py` module, and see if the same data preparation would suffice for the autoencoder?

I have included some "additional" stuff in `autoencoder_utils.py` right now, but the plan is to not include it when merging this. However, if some of it is seen as useful, it could be integrated into `machine_learning_general.py`. When the implementation is about ready, I will add some unit tests.