Open artbataev opened 5 years ago
You probably will have to do a C.sequence.unpack, slice from there, and C.to_sequence.
@delzac, it also doesn't work:
n_channels = 3
input_var = cntk.sequence.input_variable([cntk.FreeDimension, n_channels])
unpacked_input = cntk.sequence.unpack(input_var, padding_value=0, no_mask_output=True)
sliced = cntk.slice(unpacked_input, axis=1, begin_index=0, end_index=0, strides=3)
model = cntk.to_sequence(sliced)
x = np.random.rand(1, 6, n_channels).astype(np.float32)
print(x)
print(model.eval({model.arguments[0]: x}))
This raises an exception:
RuntimeError: Function 'Slice: Output('UnpackSequenceOp17879_Output_0', [#], [* x * x 3]) -> Unknown': Slice operation index range [0,0), interpreted as [0,-3), is invalid for input 'Output('UnpackSequenceOp17879_Output_0', [#], [* x * x 3])' shape '[* x * x 3]'.
But it works correctly with axis=2.
First, your sample code runs without error on my computer. What version of cntk are you using?
Also, there's no need to define a C.FreeDimension, since C.sequence.input_variable already defines a sequence axis.
input_var = cntk.sequence.input_variable(n_channels) # this would do
Anyway, I tested it out. It seems that unpacking a sequence and then slicing it causes a RuntimeError, which might be a bug.
But if you stick to C.input_variable((C.FreeDimension, n)), it works fine.
import cntk as C
import numpy as np
n_channels = 3
input_var = C.sequence.input_variable(n_channels) # RuntimeError: NarrowTo: stride 3 is invalid for interval [0, 1).
# input_var = C.input_variable([C.FreeDimension, n_channels]) # works fine
print(input_var.shape)
unpacked_input = C.sequence.unpack(input_var, padding_value=0, no_mask_output=True)
print(unpacked_input.shape)
sliced = C.slice(unpacked_input, axis=0, begin_index=0, end_index=0, strides=3)
print(sliced.shape)
model = C.to_sequence(sliced)
print(model.shape)
x = np.random.rand(6, n_channels).astype(np.float32)
print(x)
print(sliced.eval({model.arguments[0]: [x]}))
Thank you very much, @delzac, this code works with CNTK 2.6. I also tried to use it with CNTK 2.4, but it fails.
Unfortunately, I have a more complex model (with recurrence), so I have to pack and unpack a sequence, and that fails in CNTK 2.6.
Fails
n_channels = 3
input_var = C.input_variable([C.FreeDimension, n_channels])
print(input_var.shape)
packed_input_var = cntk.to_sequence(input_var)
unpacked_input_var = cntk.sequence.unpack(packed_input_var, padding_value=0, no_mask_output=True)
sliced = C.slice(unpacked_input_var, axis=0, begin_index=0, end_index=0, strides=3)
print(sliced.shape)
model = C.to_sequence(sliced)
print(model.shape)
x = np.random.rand(6, n_channels).astype(np.float32)
print(x)
print(sliced.eval({model.arguments[0]: [x]}))
with RuntimeError: NarrowTo: stride 3 is invalid for interval [0, 1).
The error clearly comes from slicing an unpacked sequence. But I'm not sure how to help you from here either. :(
Can you do the slicing while it's still a free dimension? Or do you slice after the recurrence, so that it will always be on the sequence axis?
I do slice after the recurrence :(
I found a way to do it, but it is very ugly (I now use a 1-D convolution with an identity matrix and stride=3).
That is ingenious! I learned something today, thanks!
@KeDengMS Do you have a better solution?
@artbataev Hi, I found myself needing to stride on the sequence axis too. Can I check how you initialised the kernel? I found that the current CNTK Python API blocks me from initialising through init=my_kernel.
@delzac, for now I found a better solution: use maxpooling, not convolution (since convolution is very slow)
def subsample(input_, subsampling_factor, n_channels):
    output = cntk.sequence.unpack(input_, padding_value=0, no_mask_output=True)
    # output = cntk.expand_dims(output, axis=0)  # this doesn't work, possibly a bug in CNTK
    output = cntk.reshape(output, (1, -1, n_channels))  # adding additional dimension
    sliced = cntk.layers.MaxPooling((1, 1, 1), strides=(1, subsampling_factor, 1))(output)
    sliced = cntk.reshape(sliced, (-1, n_channels))  # removing additional dimension
    output = cntk.to_sequence(sliced)
    return output
There are also some strange things about this solution. It seems that reshaping the tensor is unnecessary, and the code can be simplified:
def subsample(input_, subsampling_factor):
    """Be careful: this doesn't work on GPU!"""
    output = cntk.sequence.unpack(input_, padding_value=0, no_mask_output=True)
    sliced = cntk.layers.MaxPooling((1, 1), strides=(subsampling_factor, 1))(output)
    output = cntk.to_sequence(sliced)
    return output
But it works only on CPU; on GPU there is an error:
RuntimeError: cuDNN failure 3: CUDNN_STATUS_BAD_PARAM ; GPU=0 ; hostname=... ; expr=cudnnPoolingForward(*m_cudnn, *(m_pool), &C::One, m_inT, ptr(in), &C::Zero, m_outT, ptr(out))
As for convolution, I used this code, which also works but is significantly slower:
import cntk
import numpy as np

def subsample(input_, subsampling_factor, n_channels):
    output = cntk.transpose(cntk.sequence.unpack(input_, padding_value=0, no_mask_output=True), perm=[1, 0])
    output = cntk.convolution(
        # kernel shape: out_channels, in_channels, kernel_size
        cntk.Constant(np.eye(n_channels, n_channels, dtype=np.float32).reshape(n_channels, n_channels, 1)),
        output,
        strides=[1, subsampling_factor],
        dilation=(1, 1),
        auto_padding=[False, False],
    )
    output = cntk.to_sequence(cntk.transpose(output, perm=[1, 0]))
    return output
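To make it clearer why the identity kernel works, here is a minimal pure-Python sketch (no CNTK involved; strided_conv1d and identity_kernel are hypothetical helper names, not part of any library). It assumes an unpadded convolution with a kernel of width 1, as in the code above, and shows that the result is simply every k-th frame.

```python
def strided_conv1d(frames, kernel, stride):
    """Unpadded 1-D convolution over time.

    frames: list of T frames, each a list of C_in floats.
    kernel: kernel[o][i][k] is the weight for output channel o,
            input channel i and tap k.
    """
    width = len(kernel[0][0])
    out = []
    for start in range(0, len(frames) - width + 1, stride):
        out.append([
            sum(kernel[o][i][k] * frames[start + k][i]
                for i in range(len(kernel[o]))
                for k in range(width))
            for o in range(len(kernel))
        ])
    return out


def identity_kernel(n_channels):
    # Shape (out_channels, in_channels, 1): each output channel copies
    # exactly one input channel.
    return [[[1.0 if o == i else 0.0] for i in range(n_channels)]
            for o in range(n_channels)]


frames = [[float(10 * t + c) for c in range(3)] for t in range(7)]
subsampled = strided_conv1d(frames, identity_kernel(3), stride=3)
```

Here subsampled holds frames 0, 3 and 6 unchanged. The multiplications by zero are wasted work, which is probably part of why the convolution route is slower than a pure selection.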
@artbataev Thanks for sharing, I managed to work it out too. I used SequentialConvolution to do it.
Your maxpooling approach is a wonderful idea too. But how do you ensure that the pad_values are not included in the stride when you use a sequence.unpack and C.to_sequence earlier?
Anyhow, you can do this to avoid reshape:
C.expand_dims(x, axis=C.Axis.new_leading_axis())
...
C.squeeze()
@delzac
Your maxpooling approach is a wonderful idea too. But how do you ensure that the pad_values are not included in the stride when you use a sequence.unpack and C.to_sequence earlier?
There is no need to worry about it, because the max pooling (or average pooling) operation in this case doesn't actually perform any pooling. The kernel is [1,1,1], so it takes the element itself, and the stride makes it take every k-th element:
sliced = cntk.layers.MaxPooling((1, 1, 1), strides=(1, subsampling_factor, 1))(output)
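A small pure-Python sketch of the same idea, in case it helps (max_pool1d is a hypothetical helper, not a CNTK function, and it assumes no padding): with a window of 1 the maximum is taken over a single element, so the stride alone decides which elements survive.

```python
def max_pool1d(values, window, stride):
    """Unpadded 1-D max pooling over a list of numbers."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, stride)]


sequence = [5, -2, 7, 0, 3, 9, -4, 1]
k = 3

# Window of 1: no real pooling, just every k-th element.
picked = max_pool1d(sequence, window=1, stride=k)

# Window of 3, for contrast: this one does mix neighbouring elements.
pooled = max_pool1d(sequence, window=3, stride=k)
```

With window=1 the result equals sequence[::k], including negative values, so the max never changes the data.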
C.expand_dims(x, axis=C.Axis.new_leading_axis())
this works, thank you!
C.squeeze()
Unfortunately, squeeze doesn't work correctly on a tensor that used to be a sequence, so I can't use it =(
If you are asking about changes in shape after sequence.unpack, I think there is no better solution than to track the correct shape of the tensor manually and use it with C.to_sequence.
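A rough illustration of what tracking the shape manually could look like for a padded batch, in plain Python (pad_batch and subsample_padded are hypothetical helpers, and PAD stands in for padding_value). It assumes the stride starts at position 0, so a sequence of length n keeps ceil(n / k) items.

```python
import math

PAD = 0.0  # stands in for padding_value in sequence.unpack


def pad_batch(sequences):
    """Right-pad every sequence to the length of the longest one."""
    longest = max(len(s) for s in sequences)
    return [s + [PAD] * (longest - len(s)) for s in sequences]


def subsample_padded(padded, lengths, k):
    """Take every k-th step and compute how many of them are real data."""
    strided = [row[::k] for row in padded]
    new_lengths = [math.ceil(n / k) for n in lengths]
    return strided, new_lengths


sequences = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], [1.0, 2.0, 3.0, 4.0]]
padded = pad_batch(sequences)
strided, new_lengths = subsample_padded(padded, [len(s) for s in sequences], k=3)

# Cutting each row back to its new length drops the padded positions.
trimmed = [row[:n] for row, n in zip(strided, new_lengths)]
```

The second row of strided still contains one padded value, which is exactly the concern raised above. As far as I remember, C.to_sequence accepts an optional sequence_lengths argument that could take these new lengths, but I have not checked that against this exact setup.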
@artbataev got it! Thanks for your inputs :)
I thought of a cleaner implementation for sequence.stride. It will work regardless of the number of static axes you have in the sequence.
Just leaving the code here in case anyone else needs it. The master version can be found in cntkx, my own CNTK extension library. You can just do a pip install cntkx to get it.
from math import pi

import cntk as C
import cntkx as Cx  # assumed alias for the cntkx package, which provides Cx.scalar

def stride(x, s: int, tol: float = 0.1):
    p = position(x)
    integers = p / s  # every s-th sequence item will be an integer
    # sin of an integer multiple of pi is close to zero
    valid = C.less_equal(C.abs(C.sin(integers * pi)), tol)
    result = C.sequence.gather(x, valid)
    return result

def position(x, name=''):
    @C.BlockFunction('position', name)
    def inner(a):
        # reconcile_dynamic_axes is necessary to avoid subtle bugs e.g. sequence.where and one_hot
        return C.reconcile_dynamic_axes(C.sequence.where(C.ones_like(Cx.scalar(a))), a)

    return inner(x)  # {#, *] [1,]
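For anyone who wants to see which positions the sine test keeps, here is a plain-Python sketch of just the mask (stride_mask and selected are hypothetical names; nothing here calls CNTK). It assumes positions start at 0 and uses the same default tol of 0.1.

```python
import math


def stride_mask(length, s, tol=0.1):
    """True where |sin(pi * p / s)| <= tol, i.e. where p / s is near an integer."""
    return [abs(math.sin(math.pi * p / s)) <= tol for p in range(length)]


def selected(length, s, tol=0.1):
    """Positions that the mask keeps."""
    return [p for p, keep in enumerate(stride_mask(length, s, tol)) if keep]


every_third = selected(10, 3)    # small stride: only multiples of 3
stride_31 = selected(70, 31)     # sin(pi / 31) is just above 0.1
stride_32 = selected(70, 32)     # sin(pi / 32) is just below 0.1
```

One thing this sketch suggests: with tol=0.1 the test stays exact up to s=31, but from s=32 onwards the neighbours of each multiple also pass, because sin(pi / s) drops below the tolerance. For large strides a smaller tol, somewhere below sin(pi / s), would seem to be needed.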
@delzac, thank you for the solution, I'll try it! Have you measured speed of this implementation against Convolution / MaxPooling?
@artbataev I tested against sequential convolution and there wasn't any substantial difference!
Is it possible to do slice with step along dynamic axis? I'm working with sequence model and want to take every 3rd frame.
I found that cntk.sequence.slice doesn't support a step parameter. Also, this example raises an error: