Closed xhzhao closed 7 years ago
Would RowStack work for you?
// RowStack (input0, input1, ...)
// stacks multiple inputs on top of each other
// The inputs will be spliced w.r.t. their first tensor dimension (the "row" dimension).
Let's say you have 4 2D tensors, each of size [w_i x h]. RowStack will give you one 2D tensor [(w_1+w_2+w_3+w_4) x h]. Currently, CNTK can only stack along the first dimension.
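To make the shape arithmetic concrete, here is a minimal Python sketch of what the result shape looks like. It is not CNTK code: `row_stack_shape` is a hypothetical helper that only models shapes as tuples, under the assumption that all inputs must agree in every dimension except the first.

```python
def row_stack_shape(shapes):
    """Shape of the result when stacking tensors along their first dimension.

    Hypothetical helper for illustration only (not part of CNTK).
    Each shape is a tuple such as (w_i, h); trailing dimensions must match.
    """
    if not shapes:
        raise ValueError("need at least one input shape")
    trailing = tuple(shapes[0][1:])
    for shape in shapes:
        if tuple(shape[1:]) != trailing:
            raise ValueError(f"trailing dimensions differ: {shape[1:]} vs {trailing}")
    # Only the first ("row") dimension grows; everything else is unchanged.
    return (sum(shape[0] for shape in shapes),) + trailing


# Four 2D tensors of size [w_i x h] with h = 5.
result = row_stack_shape([(2, 5), (3, 5), (4, 5), (1, 5)])
print(result)  # (10, 5)
```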
Note that "on top" here refers to a "matrix interpretation" of the tensor where the first dimension is the "tall" dimension. The "tall" dimension is the dimension with the fastest-changing index, the dimension with stride 1. For image tensors [W x H x C], the "tall" dimension is W, not H.
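The stride remark can be checked with a small Python sketch. It assumes the layout described above (first index changes fastest, as in column-major storage); the helper names `column_major_strides` and `flat_offset` are made up for this illustration and are not CNTK functions.

```python
def column_major_strides(shape):
    """Strides (in elements) when the first index changes fastest."""
    strides = []
    step = 1
    for extent in shape:
        strides.append(step)
        step *= extent
    return tuple(strides)


def flat_offset(index, strides):
    """Position of a multi-dimensional index in the flat buffer."""
    return sum(i * s for i, s in zip(index, strides))


# Image tensor [W x H x C] with W=4, H=3, C=2.
strides = column_major_strides((4, 3, 2))
print(strides)                          # (1, 4, 12)
print(flat_offset((1, 0, 0), strides))  # 1: one step along W moves by one element
print(flat_offset((0, 1, 0), strides))  # 4: one step along H skips a whole run of W
```

W has stride 1 here, which is why it counts as the "tall" dimension in this interpretation.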
If this works for you except for the requirement of stacking along the first dimension, we can use TransposeDimensions() to work around it.
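The idea is to transpose, stack along the first dimension, and transpose back. The following is a rough Python sketch of that idea on plain nested lists, not NDL or CNTK code. It assumes a matrix is a list of rows indexed as `m[i][j]`, with `i` standing in for the first dimension; all function names are hypothetical.

```python
def transpose(m):
    """Swap the two dimensions of a 2D nested list."""
    return [list(column) for column in zip(*m)]


def row_stack(*matrices):
    """Concatenate along the first dimension only (the restriction discussed here)."""
    stacked = []
    for m in matrices:
        stacked.extend(m)
    return stacked


def stack_along_second(*matrices):
    """Stack along the second dimension using only transpose and row_stack."""
    return transpose(row_stack(*(transpose(m) for m in matrices)))


a = [[1, 2],
     [3, 4]]   # 2 x 2
b = [[5],
     [6]]      # 2 x 1
print(stack_along_second(a, b))  # [[1, 2, 5], [3, 4, 6]]
```

For a depth concatenation, the same pattern would presumably swap the channel dimension into first position, stack, and swap back.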
// TransposeDimensions (input, dim1, dim2)
// - swaps index dimensions dim1 and dim2. The values are 1-based; 1 stands for the leading dimension.
// - new dimensions can be created; e.g. a column vector can be transposed into a row vector, which is a [1 x N] tensor
// - transposing into the time dimension is currently not supported
// - internally implemented with tensor lib by shuffling dimensions with their strides
// - input may be minibatch data or not
// Transpose (input) = TransposeDimensions (input, 1, 2)
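As a shape-level illustration of the comment above, here is a small Python sketch. It is not the CNTK implementation: `transpose_dimensions_shape` is a hypothetical helper, and padding missing dimensions with size 1 is an assumption made to mirror the "[1 x N]" example.

```python
def transpose_dimensions_shape(shape, dim1, dim2):
    """Result shape after swapping 1-based dimensions dim1 and dim2.

    Hypothetical model for illustration: dimensions beyond the tensor's
    rank are assumed to have size 1, so a swap can create a new dimension.
    """
    if dim1 < 1 or dim2 < 1:
        raise ValueError("dimensions are 1-based")
    dims = list(shape)
    # Pad with singleton dimensions so both indices exist.
    dims += [1] * (max(dim1, dim2) - len(dims))
    dims[dim1 - 1], dims[dim2 - 1] = dims[dim2 - 1], dims[dim1 - 1]
    return tuple(dims)


print(transpose_dimensions_shape((7,), 1, 2))       # (1, 7): column vector to row vector
print(transpose_dimensions_shape((5, 6), 1, 2))     # (6, 5): same as Transpose (input)
print(transpose_dimensions_shape((4, 3, 2), 2, 3))  # (4, 2, 3)
```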
Alternatively, we could change the RowStack C++ code. The node code itself can already stack along arbitrary dimensions, and it would take maybe 10 lines of code to expose that to NDL. It is not exposed at present mostly because that would require a name change (since it would no longer be stacking "rows"), so we'd have to deal with backward compatibility, and we'd need test cases.
@frankseide
Hi Frank,
I just want to use TransposeDimensions(), though not for GoogLeNet. But I got an "undefined function or macro" error. It seems to be commented out in CNTK/Source/ActionsLib/NetworkDescriptionLanguage.cpp, isn't it? If I uncomment it and run make again, I get a segmentation fault (I wrote something like "input_transpose = TransposeDimensions (input, 2,3)").
Any advice? Thanks!
Hi, xhzhao,
Did you successfully run GoogLeNet on CNTK? Thanks a lot!
If anyone has GoogLeNet working, I would like to try it out as well. Thanks.
@wangxianliang No, I tried to use TransposeDimensions(), but it failed.
Does anybody know how to do the transpose in NDL? When I use ArrayTransposeDimensions, it says "EXCEPTION occurred: Undefined function or macro 'ArrayTransposeDimensions'". The same happens for TransposeDimensions.
Please help!
You must use BrainScriptNetworkBuilder, cf. #576.
Please check out https://www.microsoft.com/en-us/cognitive-toolkit/features/model-gallery/?filter=Image, which has GoogLeNet. Please file a new issue if you run into problems. Thanks!
I need to run GoogLeNet on CNTK, but there are no such examples. I tried to write my own config and NDL file, but I ran into a problem: how do I do "DepthConcat" in CNTK?
The example code in Torch is at this link: https://github.com/soumith/convnet-benchmarks/blob/master/torch7/imagenet_winners/googlenet.lua Torch uses the operator "Concat" to do this. Is there a similar operator in CNTK? How do I use it?