deepgram / kur

Descriptive Deep Learning
Apache License 2.0

How to get every activation layer output? #74

Closed EmbraceLife closed 7 years ago

EmbraceLife commented 7 years ago

How to get every activation layer output?

Given a model's weights and biases and a single data sample, how can I get each activation layer's output?

The workflow: a data sample --> reshape (if needed) --> dot(weights) + biases --> apply activation function --> activation layer output

Is there an easy way of getting activation layer outputs in Kur?

Below is my exploration, but it was not successful at all.

What I know:

Logic flow: get output of each layer
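
For a single dense layer, that flow would look something like this (a minimal NumPy sketch; the shapes and random placeholder arrays are only illustrative, not the real MNIST weights):

    import numpy as np

    # placeholder sample and parameters standing in for the real ones
    sample = np.random.rand(28, 28).astype('float32')    # one input image
    weights = np.random.rand(784, 10).astype('float32')  # dense layer weights
    biases = np.random.rand(10).astype('float32')        # dense layer biases

    # reshape (if needed)
    flat = sample.reshape(1, -1)                 # (1, 784)
    # dot(weights) + biases
    pre_activation = flat.dot(weights) + biases  # (1, 10)
    # apply activation function (softmax) --> activation layer output
    exp = np.exp(pre_activation - pre_activation.max())
    activation_output = exp / exp.sum()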

EmbraceLife commented 7 years ago

Now I have managed to use PyTorch to get each layer's output from a data sample and the saved weight/bias files (see the code below).

But I wonder: is there a better way of doing it?

    # create a single data sample
    sample = None
    for batch in provider:
        sample = batch['images'][0]
        break

    # make weights and biases available as numpy arrays
    import kur.utils.idx as idx
    # shape (10, 784)
    dense_w = idx.load("../../Hvass_tutorial1_folders/mnist.best.valid.w/layer___dense_0+weight.kur")
    # shape (10,)
    dense_b = idx.load("../../Hvass_tutorial1_folders/mnist.best.valid.w/layer___dense_0+bias.kur")
    # for the matmul below we need: sample (1, 784), weight (784, 10), bias (10,)

    import torch
    import torch.nn.functional as F
    from torch.autograd import Variable

    # convert data sample, weights, biases to torch.Variable
    # get numpy into Tensor, then to Variable
    sample_tensor = torch.from_numpy(sample)
    sample_var = Variable(sample_tensor)
    weight_tensor = torch.from_numpy(dense_w)
    weight_var = Variable(weight_tensor)
    bias_tensor = torch.from_numpy(dense_b)
    bias_var = Variable(bias_tensor)

    # flatten layer
    # reshape sample_var
    sample_flat = sample_var.view(1, -1)

    # hidden 10 node dense layer
    # transpose a tensor or variable
    weight_var_t = torch.t(weight_var)
    # matmul: torch.mm
    sample_w = torch.mm(sample_flat, weight_var_t)
    # add: torch.add
    # hidden layer output
    sample_wb = torch.add(sample_w, bias_var)
    # size(): see shape (1, 10)
    # sample_wb.size()

    # activation layer output
    # apply F.softmax
    output = F.softmax(sample_wb)
    # test F.softmax
    # output.sum()
    # torch.sum(output)
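
As a quick sanity check on the code above (using the variables already defined there), the softmax output should sum to approximately 1, and the argmax gives the predicted digit:

    # pull the probabilities out of the Variable
    probs = output.data.numpy().flatten()
    # softmax output should sum to ~1
    assert abs(probs.sum() - 1.0) < 1e-4
    # index of the largest probability = predicted class
    print('predicted digit:', probs.argmax())
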
ajsyp commented 7 years ago

It depends on how you want to do it. By far the easiest way is simply to mark the appropriate layers as additional output layers in the Kurfile. This can be done as part of the layer declaration using the sink parameter (you'll probably want to give your layer a better name than the auto-generated default, but that isn't necessary):

model:
  # ... previous layers
  - dense: 10
    sink: yes
    name: my_dense_layer
  # ... more layers

It can also be done with multiple, explicit output layers:

model:
  # ... previous layers
  - dense: 10
  - output: my_dense_layer
  # ... more layers

This will cause the additional activations to be captured at each batch and returned, as usual, alongside all the other model outputs/predictions in the output dictionary (e.g., the result of Backend.train/test/evaluate).
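
For example, if your evaluate section writes its results to a pickle file (e.g., destination: output.pkl), you can inspect the captured activations afterwards. This is just a rough sketch; the exact dictionary layout depends on your Kurfile and layer names, so check the printout rather than taking the key name below literally:

    import pickle

    # load the results written by `kur evaluate`
    # (assuming the Kurfile's evaluate section has e.g. `destination: output.pkl`)
    with open('output.pkl', 'rb') as fh:
        results = pickle.load(fh)

    # see what was captured; with `sink: yes`, the dense layer's activations
    # should appear alongside the normal model outputs
    print(type(results))
    if isinstance(results, dict):
        print(list(results.keys()))
    # activations = results['my_dense_layer']  # hypothetical key; verify against the printout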

Definitely don't do it the way you just did: you're basically rebuilding the low-level tensor operations, which is presumably what you want to avoid by using Kur :)

EmbraceLife commented 7 years ago

Yes, you are right: it was painful to go back to low-level tensor ops when Kur can do it so easily, as you suggested above. Yesterday I finally managed to get intermediate layer outputs using Keras and plotted the convolutional layers in Kur. But the method you describe above makes things so much easier. I will give it a try now. Thanks a lot!