ShuaiGuo16 / Active_Learning

A walkthrough of applying an active learning strategy to efficiently train a Gaussian Process model.

Active-learning question #1

Open · Daniel-Trung-Nguyen opened this issue 3 years ago

Daniel-Trung-Nguyen commented 3 years ago

Hi Shuai Guo, I came here from your active learning post on Towards Data Science. Super cool post, and thanks for sharing! You showed an example of active learning for a univariate function F(x), but I am curious about extending the method to multiple variables. I have a very slow process-based model that takes a lot of inputs, and I want to build a surrogate of this model using your method so I can do the UQ more efficiently. Any insights and code examples would be much appreciated. Thanks, Daniel

ShuaiGuo16 commented 3 years ago

Hi Daniel,

Thank you for your question and for your interest in my post! The same methodology I showed in the blog extends directly to multivariate cases. However, Gaussian Process models normally cannot handle a large number of inputs effectively (more than ~20 features, as a rule of thumb). As the number of input parameters grows, the required amount of training data grows very fast, and so does the training time, due to the matrix inversion involved in GP training. This is the well-known "curse of dimensionality".
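For concreteness, here is a minimal sketch of fitting a GP surrogate on multivariate input with scikit-learn; the toy function, dimension, and sample sizes below are my own illustration, not from the repo's notebook:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def expensive_model(X):
    # Stand-in for the slow process-based model; X has shape (n, d)
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + X[:, 2]

d = 3                                       # number of input variables
rng = np.random.default_rng(42)
X_train = rng.uniform(0.0, 1.0, size=(30, d))
y_train = expensive_model(X_train)

# Anisotropic RBF kernel: one length scale per input dimension
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(d))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gp.fit(X_train, y_train)

# Posterior mean and standard deviation at new points -- the same two
# quantities the 1D active-learning loop in the blog post relies on
X_new = rng.uniform(0.0, 1.0, size=(5, d))
mean, std = gp.predict(X_new, return_std=True)
```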

For this case, it is common to perform dimensionality reduction first as a pre-processing step. You can adopt a subset-based approach, where you keep only the inputs that you believe strongly influence the output (based on sensitivity analysis results), or a subspace-based approach, where you project the original inputs onto a lower-dimensional subspace. For the latter, you could try a method called "Active Subspace", which is roughly a supervised version of PCA and performs very well in UQ settings; a sketch of the idea is given below. Whichever dimensionality reduction approach you adopt, once you have reduced the input features, you can proceed with the same methodology shown in the blog to build the surrogate model.
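A minimal sketch of the Active Subspace idea (the toy function and its analytic gradient are my own illustration, not from the referenced literature): estimate C = E[∇f ∇fᵀ] by Monte Carlo, eigendecompose it, and project the inputs onto the dominant eigenvectors.

```python
import numpy as np

def f_grad(X):
    # Analytic gradient of a toy f(x) = exp(0.7*x1 + 0.3*x2); in practice,
    # use finite differences or adjoint gradients of your own model
    a = np.array([0.7, 0.3] + [0.0] * (X.shape[1] - 2))
    return np.exp(X @ a)[:, None] * a[None, :]

d = 10
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, d))   # samples over the input space

G = f_grad(X)                               # (n, d) gradient samples
C = G.T @ G / G.shape[0]                    # Monte Carlo estimate of E[g g^T]
eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 1                                       # pick k at a gap in the spectrum
W1 = eigvecs[:, :k]                         # dominant ("active") directions
Z = X @ W1                                  # reduced inputs: train the GP on Z
```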

As an alternative, you could try other surrogate models that are less susceptible to the curse of dimensionality, such as neural networks. I would recommend this paper: https://www.sciencedirect.com/science/article/pii/S0021999118305655.

If you want to stick with Gaussian Processes and leverage their active learning properties, I have this recommendation for you: https://link.springer.com/article/10.1007/s00158-015-1395-9. In that paper, the authors developed a partial least squares method to train GP models with high-dimensional input; a usage sketch follows below. I hope it serves as a starting point for developing solutions tailored to your problem at hand.
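To my knowledge, a partial-least-squares GP of this flavor is available as the KPLS model in the SMT library (`pip install smt`); the data below is synthetic and the exact options are an assumption, so check SMT's documentation:

```python
import numpy as np
from smt.surrogate_models import KPLS  # assumes the SMT library is installed

d = 30
rng = np.random.default_rng(1)
X_train = rng.uniform(0.0, 1.0, size=(100, d))
y_train = np.sum(np.sin(X_train), axis=1)   # synthetic stand-in response

# n_comp = number of PLS components, i.e. the effective input dimension
sm = KPLS(n_comp=2, print_global=False)
sm.set_training_values(X_train, y_train)
sm.train()

X_test = rng.uniform(0.0, 1.0, size=(10, d))
y_pred = sm.predict_values(X_test)          # posterior mean
y_var = sm.predict_variances(X_test)        # posterior variance
```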

I hope these comments are helpful for your problem. Good luck!

Best, Shuai

Daniel-Trung-Nguyen commented 2 years ago

Hi Shuai, I just realized that I never replied to your message. Thanks a lot for the great resources! I have now also looked at your surrogate optimization article and found it very interesting and intuitive. Thanks for breaking down complicated concepts in such an easy-to-understand way. One of the best tutorials I have found on Medium! I am still a bit confused about how to adapt it to a case of 2 or 3 input variables. In your examples, `new_sample = candidates[np.argmax(EI)]`. How can I generate a `candidates` matrix of higher dimensions and then calculate the corresponding EI? It might be a really basic question, but sorry, I don't have much experience with matrix manipulation. I appreciate your help, but I understand if it takes too much of your time. Thanks, Daniel
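For readers landing on this thread, here is a minimal sketch of one way to do this, assuming a scikit-learn `GaussianProcessRegressor`; the toy data, bounds, and the `expected_improvement` helper are illustrative, not from the repo's notebook. The key point: each row of `candidates` is one point, each column one input variable, so the same `np.argmax(EI)` selection line works in any dimension.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(gp, candidates, y_best):
    # Standard EI (maximization) from the GP posterior mean and std
    mean, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-12)            # guard against zero variance
    z = (mean - y_best) / std
    return (mean - y_best) * norm.cdf(z) + std * norm.pdf(z)

d = 3                                       # works identically for 2, 3, ... inputs
rng = np.random.default_rng(0)
bounds = np.array([[0.0, 1.0]] * d)         # (d, 2) per-variable bounds

# Fit a GP on a few toy observations (stand-in for the expensive model)
X_train = rng.uniform(bounds[:, 0], bounds[:, 1], size=(15, d))
y_train = np.sin(X_train).sum(axis=1)
gp = GaussianProcessRegressor().fit(X_train, y_train)

# Candidate matrix: instead of a 1D grid, draw random (or Latin Hypercube)
# points over the bounds -> shape (n_candidates, d)
candidates = rng.uniform(bounds[:, 0], bounds[:, 1], size=(10_000, d))

EI = expected_improvement(gp, candidates, y_train.max())
new_sample = candidates[np.argmax(EI)]      # same selection line as in the post
```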