modAL-python / modAL

A modular active learning framework for Python
https://modAL-python.github.io/
MIT License

Keras regressor integration issues #85

Open · cpfpengfei opened this issue 4 years ago

cpfpengfei commented 4 years ago

After defining a Keras model:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_keras_regressor_model():
    # train_dataset is defined elsewhere; its keys give the feature count
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation='relu'),
        layers.Dense(1)
    ])

    optimizer = tf.keras.optimizers.RMSprop(0.001)

    model.compile(loss='mse',
                  optimizer=optimizer,
                  metrics=['mae', 'mse'])
    return model

Using KerasRegressor from Keras's scikit-learn wrapper, initializing the ActiveLearner raises no issues, and the initial metrics can be evaluated:

from modAL.models import ActiveLearner

# regressor is the KerasRegressor-wrapped model from above
learner = ActiveLearner(
    estimator=regressor,
    X_training=X_initial, y_training=y_initial,
    verbose=1
)

However, the active learning loop below throws an error:

n_queries = 50
for idx in range(n_queries):
    query_idx, query_instance = learner.query(X_pool, verbose=0)
    learner.teach(
        X=X_pool[query_idx], y=y_pool[query_idx], only_new=True,
        verbose=1
    )

AttributeError: 'KerasRegressor' object has no attribute 'predict_proba'

Does modAL work with KerasRegressor?

Many thanks!

cosmic-cortex commented 4 years ago

Hi! You need to use the scikit-learn wrapper for Keras. If the built-in scikit-learn wrapper works with your regressor, it should work in modAL too. Let me know if there are any issues!
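
For reference, wrapping the builder from the first post would look roughly like this (a minimal sketch; the epochs and batch_size values are illustrative, and at the time of this thread the wrapper lived in tensorflow.keras.wrappers.scikit_learn, since superseded by SciKeras):

from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

# Wrap the builder so the model exposes the scikit-learn fit/predict API
# that modAL expects from its estimator.
regressor = KerasRegressor(build_fn=build_keras_regressor_model,
                           epochs=50, batch_size=32, verbose=0)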

cpfpengfei commented 4 years ago

Hi, thank you for your reply! Yes, I am using KerasRegressor from the scikit-learn wrapper for Keras. Initializing the active learner with the estimator set to the regressor works fine, and the model is successfully fitted on the initial data.

But upon running the active learning loop, it throws the error mentioned above.

cosmic-cortex commented 4 years ago

Come to think of it, a neural-network-based regressor may not have a way to estimate prediction probabilities, hence no predict_proba method by default. It can be added, for instance, with dropout layers ("Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning"). You have to implement this yourself for your model, but from there you'll be able to use it for active learning.

Here is an implementation in Keras: https://gdmarmerola.github.io/risk-and-uncertainty-deep-learning/
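
A minimal sketch of the idea (Monte Carlo dropout), assuming the same kind of model as in the first post; the dropout rate and number of passes are illustrative:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_mc_dropout_model(n_features):
    # Dropout sits between the Dense layers; with training=True at
    # predict time, each forward pass is an approximate posterior sample.
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=[n_features]),
        layers.Dropout(0.1),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.1),
        layers.Dense(1),
    ])
    model.compile(loss='mse', optimizer=tf.keras.optimizers.RMSprop(0.001))
    return model

def mc_dropout_predict(model, X, n_samples=100):
    # training=True keeps dropout active, so every pass differs;
    # the spread across passes serves as the uncertainty estimate.
    preds = np.stack([model(X, training=True).numpy().ravel()
                      for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)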

cpfpengfei commented 4 years ago

Wow, thanks a lot for your help! For my regression model, I am trying to implement Bayesian optimization with maximum expected improvement (EI) as the query strategy. Given what you mentioned, would I have to implement active learning from scratch, taking inspiration from modAL, instead of plugging into your current package?

cosmic-cortex commented 4 years ago

You can definitely use modAL. You just need to implement a predict_proba method for your custom estimator, and it will be perfectly usable. There is an entire page in the documentation about this; check it out, it explains in detail how you can use custom objects. You can find it here: Extending modAL
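
Note that for the regression/optimization path discussed below, modAL's acquisition functions call predict(X, return_std=True) rather than predict_proba (as the optimizer_EI snippet further down shows). A hypothetical adapter over the MC-dropout helper sketched above could look like:

from sklearn.base import BaseEstimator, RegressorMixin

class MCDropoutRegressor(BaseEstimator, RegressorMixin):
    # Hypothetical adapter: wraps a Keras model and exposes the
    # predict(X, return_std=True) signature that modAL's acquisition
    # functions (e.g. max_EI) expect from the estimator.
    def __init__(self, build_fn, n_samples=100, epochs=50, batch_size=32):
        self.build_fn = build_fn
        self.n_samples = n_samples
        self.epochs = epochs
        self.batch_size = batch_size

    def fit(self, X, y, **kwargs):
        self.model_ = self.build_fn(X.shape[1])
        self.model_.fit(X, y, epochs=self.epochs,
                        batch_size=self.batch_size, verbose=0)
        return self

    def predict(self, X, return_std=False):
        mean, std = mc_dropout_predict(self.model_, X, self.n_samples)
        return (mean, std) if return_std else mean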

cpfpengfei commented 4 years ago

Hi, thanks a lot for your help previously. I went back and tried using a Bayesian optimizer with maximum expected improvement (EI) as the query strategy, based on prediction values and their uncertainties, but I have some questions.

1. What kind of uncertainty values are we supposed to have for this to work?

def optimizer_EI(optimizer: BaseLearner, X: modALinput, tradeoff: float = 0) -> np.ndarray:
     mean, std = optimizer.predict(X, return_std=True)
     mean, std = mean.reshape(-1, ), std.reshape(-1, )

2. For the above optimizer, I managed to do something similar for my model and output a prediction value and its uncertainty (as a standard deviation) by having dropout in every layer. Both values are normalized, since the training and test data are normalized beforehand. Will my outputs be similar to what you would expect for mean and std here?

3. I printed out the EI array based on the calculation below, and it gives rather small EI values that are very close to 0. Am I doing something wrong here? (See also the sketch after this list.)

z = (pred - y_max - tradeoff) / std
EI = (pred - y_max - tradeoff) * ndtr(z) + std * norm.pdf(z)

4. What is the significance of the y_max and tradeoff values? If my objective is to have a small y value, should y_max be y_min instead?
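
For concreteness, here is a self-contained version of that EI computation on toy numbers (all values illustrative):

import numpy as np
from scipy.special import ndtr
from scipy.stats import norm

pred = np.array([0.2, 0.5, 0.9])   # MC-dropout means (illustrative)
std = np.array([0.1, 0.2, 0.05])   # MC-dropout stds (illustrative)
y_max = 0.8                        # best objective value observed so far
tradeoff = 0.0                     # exploration-exploitation knob

z = (pred - y_max - tradeoff) / std
EI = (pred - y_max - tradeoff) * ndtr(z) + std * norm.pdf(z)
# Points whose predicted mean sits far below y_max relative to their
# std get EI near 0: the model sees little chance of improvement there.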

Thank you very much for your time!

cosmic-cortex commented 4 years ago

In general, Bayesian optimization is done using a Gaussian process to model your quantity; this is what optimizer_EI and the other acquisition functions assume.

Regarding 2. and 3.: In principle, you can do this with a neural network as well. In this case, the mean and std of the predictions can be obtained by adding dropout and generating a bunch of predictions as you did. The mean and std of these will work, but I am unaware of how they perform in practice. If your EI is small, it means that your estimator thinks you probably won't improve by evaluating the function at other points.

Regarding 4.: I can refer you to this excellent review: https://www.cs.ox.ac.uk/people/nando.defreitas/publications/BayesOptLoop.pdf Although it is lengthy, it explains things much better than I am able to :)
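
For the Gaussian-process route, modAL supports this out of the box; a minimal sketch (the kernel choice and toy data are illustrative):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from modAL.models import BayesianOptimizer
from modAL.acquisition import max_EI

# GaussianProcessRegressor natively supports predict(X, return_std=True),
# which is exactly what max_EI needs.
X_initial = np.random.rand(5, 2)   # illustrative initial design
y_initial = np.random.rand(5)

optimizer = BayesianOptimizer(
    estimator=GaussianProcessRegressor(kernel=Matern(nu=2.5)),
    query_strategy=max_EI,
    X_training=X_initial, y_training=y_initial,
)

X_pool = np.random.rand(100, 2)    # illustrative candidate pool
query_idx, query_instance = optimizer.query(X_pool)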

fehmidaUsmani commented 3 years ago


Hi @cpfpengfei, did you find a strategy that works with KerasRegressor? I need help in this regard.