keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Custom layer: trainable threshold #6926

Closed zeka0 closed 7 years ago

zeka0 commented 7 years ago

I am trying to build a 'ThresholdLayer' that maps input data to 0 or 1 depending on a threshold. The threshold is a trainable TensorFlow variable. However, I'm getting weird results. Please help!

import pickle
import tensorflow as tf
import numpy as np
import keras
from keras.layers import Lambda
from keras import losses, activations, optimizers, initializers
from keras.models import *
from keras.engine.topology import Layer

sess = tf.Session()

with open(r'E:\VirtualDesktop\pandas\planet\pkl\predict.pkl', 'rb') as f:
    data = pickle.load(f)
with open(r'E:\VirtualDesktop\pandas\planet\pkl\geological_tags.pkl', 'rb') as f:
    target = pickle.load(f)
target = target[:data.shape[0]]
input_shape = data.shape[1:]
input_layer = Input(shape=input_shape)

threshold = tf.Variable(initial_value=0.2, trainable=True)
class ThresholdLayer(Layer):
    def __init__(self, **kwargs):
        super(ThresholdLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='threshold', shape=(1,), initializer=initializers.constant(0.2))
        super(ThresholdLayer, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        ge = tf.greater_equal(x, self.kernel)
        return tf.where(ge, x=tf.ones_like(x), y=tf.zeros_like(x))

    def compute_output_shape(self, input_shape):
        return input_shape

x = ThresholdLayer()(input_layer)

model = Model(inputs=input_layer, outputs=x)
model.compile(optimizer=optimizers.SGD(), loss=losses.MSE)

model.fit(data, target, batch_size=10, nb_epoch=1, verbose=True, shuffle=True)

print(sess.run(threshold))

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

rodvei commented 6 years ago

I was thinking about trying to implement something similar @zeka0. Good job, this looks nice! What kind of "weird results" were you getting? Did you find a solution?

I don't think this is an issue with Keras. I suspect part of the problem comes from the operation defined in ThresholdLayer.call: y=0 if x<threshold, and y=1 if x>=threshold. Its gradient with respect to the threshold is 0 everywhere except at the exact threshold value, where it is infinite. This probably makes SGD unable to optimize the threshold with respect to MSE.
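To make the zero-gradient argument concrete, here is a minimal sketch in plain Python (no Keras) that estimates d(MSE)/d(threshold) by central differences, once for the hard step and once for a sigmoid stand-in. The toy data, the probe point 0.525, the step size `eps`, and the `sharpness` constant are all assumptions chosen for illustration, not values from the issue.

```python
import math

def hard_step(x, threshold):
    """Forward pass of the original layer: 1.0 if x >= threshold, else 0.0."""
    return 1.0 if x >= threshold else 0.0

def soft_step(x, threshold, sharpness=100.0):
    """Smooth stand-in for the step; sharpness is an illustrative constant."""
    return 1.0 / (1.0 + math.exp(-sharpness * (x - threshold)))

def mse(step_fn, threshold, xs, ys):
    """Mean squared error of step_fn(x, threshold) against the labels."""
    return sum((step_fn(x, threshold) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def finite_difference(step_fn, threshold, xs, ys, eps=1e-4):
    """Central-difference estimate of d(MSE)/d(threshold)."""
    upper = mse(step_fn, threshold + eps, xs, ys)
    lower = mse(step_fn, threshold - eps, xs, ys)
    return (upper - lower) / (2 * eps)

# Toy data (made up for illustration): labels follow a true cutoff of 0.2.
xs = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
ys = [1.0 if x >= 0.2 else 0.0 for x in xs]

# Probe a threshold that sits between two data points.
hard_grad = finite_difference(hard_step, 0.525, xs, ys)
soft_grad = finite_difference(soft_step, 0.525, xs, ys)

print("hard step gradient:", hard_grad)  # no sample crosses the threshold
print("soft step gradient:", soft_grad)  # non-zero, so SGD gets a signal
```

With the hard step, nudging the threshold changes no output unless a sample sits exactly on it, so the loss is flat and the estimate is zero. The smooth version gives a positive slope here, which would push the threshold down toward the cutoff.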

What about using an extreme sigmoid function S instead, such as S(C[x-threshold]), where C is a large constant? For example:

import numpy as np
import tensorflow.keras as keras
import tensorflow as tf 

np.random.seed(seed=1)
tf.set_random_seed(seed=1)
true_cutoff = 0.2
n=10000
rand_window = 0.05
x = np.random.rand(n).reshape(-1,1)
y = ((x+np.random.rand(n).reshape(-1,1)*2*rand_window-rand_window )>=true_cutoff)*1

input_layer = keras.Input(shape=(1,))
class ThresholdLayer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(ThresholdLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(name="threshold", shape=(1,), initializer="uniform",
                                      trainable=True)
        super(ThresholdLayer, self).build(input_shape)

    def call(self, x):
        return keras.backend.sigmoid(100*(x-self.kernel))

    def compute_output_shape(self, input_shape):
        return input_shape

out = ThresholdLayer()(input_layer)
model = keras.Model(inputs=input_layer, outputs=out)
model.compile(optimizer="sgd", loss="mse")

model.fit(x, y, epochs=5)
model.get_weights()

In my case this resulted in a threshold of 0.21047172, while the true threshold is 0.2. There are probably better ways to do this, and I would love to see other suggestions!
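For anyone who wants to see the sigmoid trick without Keras, the following is a rough sketch of full-batch gradient descent on the same loss, using the analytic derivative of S(C[x-threshold]) with respect to the threshold. The evenly spaced inputs, the starting value 0.5, the learning rate, and the number of steps are assumptions made for this sketch. It is not the Keras training loop above and will not reproduce its exact numbers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad(threshold, xs, ys, sharpness):
    """Analytic d(MSE)/d(threshold) for predictions sigmoid(sharpness * (x - threshold))."""
    total = 0.0
    for x, y in zip(xs, ys):
        s = sigmoid(sharpness * (x - threshold))
        # Chain rule: d(s)/d(threshold) = -sharpness * s * (1 - s)
        total += 2.0 * (s - y) * (-sharpness * s * (1.0 - s))
    return total / len(xs)

# Noise-free toy data on an even grid (an assumption; the thread samples x randomly).
n = 1000
true_cutoff = 0.2
xs = [(i + 0.5) / n for i in range(n)]
ys = [1.0 if x >= true_cutoff else 0.0 for x in xs]

threshold = 0.5       # arbitrary starting point for this sketch
learning_rate = 0.01  # illustrative value
for _ in range(300):
    threshold -= learning_rate * mse_grad(threshold, xs, ys, sharpness=100.0)

print("learned threshold:", threshold)
```

Because the sigmoid only has a noticeable slope within roughly 1/C of the threshold, the update is driven by the samples near the current threshold. A larger C gives a sharper step but a narrower band of samples that contribute gradient.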

skyap commented 4 years ago

[image]

Because you add randomness to y, the two classes overlap, as shown in the plot above. That is why your learned threshold does not equal the true threshold. The code below gets 0.20008925. Thanks for your code!

import numpy as np
import tensorflow.keras as keras
import tensorflow as tf 

np.random.seed(seed=1)
tf.set_random_seed(seed=1)
true_cutoff = 0.2
n=10000
x = np.random.rand(n).reshape(-1,1)
y = x>=true_cutoff

input_layer = keras.Input(shape=(1,))
class ThresholdLayer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(ThresholdLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(name="threshold", shape=(1,), initializer="uniform",
                                      trainable=True)
        super(ThresholdLayer, self).build(input_shape)

    def call(self, x):
        return keras.backend.sigmoid(100*(x-self.kernel))

    def compute_output_shape(self, input_shape):
        return input_shape

out = ThresholdLayer()(input_layer)
model = keras.Model(inputs=input_layer, outputs=out)
model.compile(optimizer="sgd", loss="mse")

model.fit(x, y, epochs=5)
model.get_weights()