YGMaerz closed this issue 8 years ago.
Hello, given that you have this message in the log:
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
I would say something needs to be checked in your setup. What GPU are you running it on?
Hi,
I have an Nvidia NVS 5400M. I think my setup is fine, as the other examples (e.g. MNIST) are running perfectly.
NVS 5400M is a GPU of compute capability 2.1 (i.e. Fermi). cuDNN requires Kepler or higher GPUs (i.e. CC >= 3.0), so that's why you are seeing this error. What you can try is this:
1. Change imageLayout in 01_Convolution.ndl from cudnn to legacy. This will switch to a less efficient implementation of the convolution routines which should run on a Fermi GPU.
2. Run CIFAR_convert.py with the legacy option: python CIFAR_convert.py -f legacy. This will convert the CIFAR-10 data to a legacy-compatible data layout (aka HWC).
Maybe we can find a way to discover this specific condition and improve the error message. I guess more users will run into the same problem.
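The compute-capability requirement described above boils down to a simple version comparison; a minimal illustrative sketch (the helper name is mine, not part of CNTK or cuDNN):

```python
def cudnn_supported(major: int, minor: int) -> bool:
    """cuDNN (as of v4) requires a GPU of compute capability 3.0 (Kepler) or higher."""
    return (major, minor) >= (3, 0)

# NVS 5400M is Fermi, compute capability 2.1 -> below the cuDNN minimum,
# which is what triggers the "unsupported platform" warning in the log.
print(cudnn_supported(2, 1))  # False
print(cudnn_supported(3, 0))  # True
```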
From: Alexey Kamenev [mailto:notifications@github.com] Sent: Thursday, March 3, 2016 9:47 To: Microsoft/CNTK CNTK@noreply.github.com Subject: Re: [CNTK] CIFAR10 Example: Convolution operation currently only supports 1D or 2D convolution on 3D tensors. (#194)
The resulting exception is sadly still the same:
EXCEPTION occurred: Convolution operation currently only supports 1D or 2D convolution on 3D tensors.
-------------------------------------------------------------------
Build info:
Built time: Feb 4 2016 16:45:41
Last modified date: Fri Jan 29 09:00:05 2016
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0
CUB_PATH: C:\NVIDIA\cub-1.4.1
CUDNN_PATH: C:\NVIDIA\cudnn-4.0\cuda
Build Branch:
Build SHA1:
Built by gmaerz on DE00-IN484-L
Build Path: C:\Users\gmaerz\OneDrive\CNTK\Source\CNTK\
-------------------------------------------------------------------
running on DE00-IN484-L at 2016/03/03 20:49:34
command line:
C:\Users\gmaerz\OneDrive\CNTK\x64\Debug\CNTK.exe configFile=01_Conv.config configName=01_Conv
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros=$ConfigDir$/Macros.ndl
precision=float
deviceId=Auto
prefetch=true
command=Train:Test
stderr=$OutputDir$/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=$ModelDir$/01_Convolution
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
Test=[
action=test
modelPath=$ModelDir$/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros=./Macros.ndl
precision=float
deviceId=Auto
prefetch=true
command=Train:Test
stderr=./Output/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv.config:command=Train:Test
configparameters: 01_Conv.config:ConfigDir=.
configparameters: 01_Conv.config:configName=01_Conv
configparameters: 01_Conv.config:DataDir=.
configparameters: 01_Conv.config:deviceId=Auto
configparameters: 01_Conv.config:ModelDir=./Output/Models
configparameters: 01_Conv.config:ndlMacros=./Macros.ndl
configparameters: 01_Conv.config:numMBsToShowResult=500
configparameters: 01_Conv.config:OutputDir=./Output
configparameters: 01_Conv.config:precision=float
configparameters: 01_Conv.config:prefetch=true
configparameters: 01_Conv.config:RootDir=.
configparameters: 01_Conv.config:stderr=./Output/01_Conv
configparameters: 01_Conv.config:Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configparameters: 01_Conv.config:traceLevel=1
configparameters: 01_Conv.config:Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
command: Train Test
precision = float
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
CNTKCommandTrainBegin: Train
LockDevice: Locked GPU 0 to test availability.
LockDevice: Unlocked GPU 0 after testing.
LockDevice: Locked GPU 0 for exclusive use.
NDLBuilder Using GPU 0
Reading UCI file ./Train.txt
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::SetGaussianRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
Post-processing network...
3 roots:
CE = CrossEntropyWithSoftmax
Err = ErrorPrediction
OutputNodes.z = Plus
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
Validating for node CE. 33 nodes to process in pass 1.
Validating --> labels = InputValue -> [10 {1} x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64 {1,10}]
Validating --> h1.W = LearnableParameter -> [64 x 576 {1,64}]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800 {1,64}]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800 {1,32}]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75 {1,32}]
Validating --> features = InputValue -> [3 x 32 x 32 {1,3,96} x *]
Validating --> featOffs = LearnableParameter -> [1 x 1 {1,1}]
Validating --> featScaled = Minus(features[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}], featOffs[1 x 1 {1,1}]) -> [3 x 32 x 32 {1,3,96} x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75 {1,32}], featScaled[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}]) -> [3 x 32 x 32 {1,3,96} x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32 x 1 {1,1,1,32}]
Validating --> conv1_act.p = Plus(conv1_act.c[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}], conv1_act.b[1 x 1 x 32 x 1 {1,1,1,32}]) -> [3 x 32 x 32 x 1 {1,3,96,3072} x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[3 x 32 x 32 x 1 {1,3,96,3072} x *]) -> [3 x 32 x 32 x 1 {1,3,96,3072} x *]
Validating --> pool1 = MaxPooling(conv1_act.y[3 x 32 x 32 x 1 {1,3,96,3072} x *])About to throw exception 'Convolution operation currently only supports 1D or 2D convolution on 3D tensors.'
[CALL STACK]
>Microsoft::MSR::CNTK::ThrowFormatted<std::invalid_argument>
-Microsoft::MSR::CNTK::InvalidArgument<>
-Microsoft::MSR::CNTK::ImageDimensions::ImageDimensions
-Microsoft::MSR::CNTK::PoolingNodeBase<float>::Validate
-Microsoft::MSR::CNTK::MaxPoolingNode<float>::Validate
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateNodes
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateSubNetwork
-Microsoft::MSR::CNTK::ComputationNetwork::CompileNetwork
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadNetworkFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::BuildNetworkFromDescription
-<lambda_129c9d8d27039b4c7cf30f939660c017>::operator()
-std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>
-std::_Func_impl<std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int> >,std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::_Do_call
-std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::operator()
-Microsoft::MSR::CNTK::SGD<float>::Train
-DoTrain<Microsoft::MSR::CNTK::ConfigParameters,float>
-DoCommands<float>
-wmainOldCNTKConfig
EXCEPTION occurred: Convolution operation currently only supports 1D or 2D convolution on 3D tensors.
-------------------------------------------------------------------
Usage: cntk configFile=yourConfigFile
For detailed information please consult the CNTK book
"An Introduction to Computational Networks and the Computational Network Toolkit"
-------------------------------------------------------------------
I see there is still a warning: WARNING: trying to use cuDNN on unsupported platform. Can you please also update Macros.ndl?
I will update these samples so it can be changed in one place, as in the MNIST samples.
Changed it, yet I still get the same exception. However, the WARNING: trying to use cuDNN on unsupported platform message seems to be fixed.
-------------------------------------------------------------------
Build info:
Built time: Feb 4 2016 16:45:41
Last modified date: Fri Jan 29 09:00:05 2016
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0
CUB_PATH: C:\NVIDIA\cub-1.4.1
CUDNN_PATH: C:\NVIDIA\cudnn-4.0\cuda
Build Branch:
Build SHA1:
Built by gmaerz on DE00-IN484-L
Build Path: C:\Users\gmaerz\OneDrive\CNTK\Source\CNTK\
-------------------------------------------------------------------
running on DE00-IN484-L at 2016/03/03 21:23:21
command line:
C:\Users\gmaerz\OneDrive\CNTK\x64\Debug\CNTK.exe configFile=01_Conv.config configName=01_Conv
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros=$ConfigDir$/Macros.ndl
precision=float
deviceId="-1"
prefetch=true
command=Train:Test
stderr=$OutputDir$/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=$ModelDir$/01_Convolution
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
Test=[
action=test
modelPath=$ModelDir$/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros=./Macros.ndl
precision=float
deviceId="-1"
prefetch=true
command=Train:Test
stderr=./Output/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv.config:command=Train:Test
configparameters: 01_Conv.config:ConfigDir=.
configparameters: 01_Conv.config:configName=01_Conv
configparameters: 01_Conv.config:DataDir=.
configparameters: 01_Conv.config:deviceId=-1
configparameters: 01_Conv.config:ModelDir=./Output/Models
configparameters: 01_Conv.config:ndlMacros=./Macros.ndl
configparameters: 01_Conv.config:numMBsToShowResult=500
configparameters: 01_Conv.config:OutputDir=./Output
configparameters: 01_Conv.config:precision=float
configparameters: 01_Conv.config:prefetch=true
configparameters: 01_Conv.config:RootDir=.
configparameters: 01_Conv.config:stderr=./Output/01_Conv
configparameters: 01_Conv.config:Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configparameters: 01_Conv.config:traceLevel=1
configparameters: 01_Conv.config:Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
command: Train Test
precision = float
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
CNTKCommandTrainBegin: Train
NDLBuilder Using CPU
Reading UCI file ./Train.txt
Post-processing network...
3 roots:
OutputNodes.z = Plus
CE = CrossEntropyWithSoftmax
Err = ErrorPrediction
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
Validating for node OutputNodes.z. 31 nodes to process in pass 1.
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64 {1,10}]
Validating --> h1.W = LearnableParameter -> [64 x 576 {1,64}]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800 {1,64}]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800 {1,32}]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75 {1,32}]
Validating --> features = InputValue -> [3 x 32 x 32 {1,3,96} x *]
Validating --> featOffs = LearnableParameter -> [1 x 1 {1,1}]
Validating --> featScaled = Minus(features[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}], featOffs[1 x 1 {1,1}]) -> [3 x 32 x 32 {1,3,96} x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75 {1,32}], featScaled[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}]) -> [32 x 32 x 32 {1,32,1024} x *]
Validating --> conv1_act.b = LearnableParameter -> [32 x 1 x 1 x 1 {1,32,32,32}]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 {1,32,1024} x * {W=32, H=32, C=32}], conv1_act.b[32 x 1 x 1 x 1 {1,32,32,32}]) -> [32 x 32 x 32 x 1 {1,32,1024,32768} x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x 1 {1,32,1024,32768} x *]) -> [32 x 32 x 32 x 1 {1,32,1024,32768} x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x 1 {1,32,1024,32768} x *])About to throw exception 'Convolution operation currently only supports 1D or 2D convolution on 3D tensors.'
[CALL STACK]
>Microsoft::MSR::CNTK::ThrowFormatted<std::invalid_argument>
-Microsoft::MSR::CNTK::InvalidArgument<>
-Microsoft::MSR::CNTK::ImageDimensions::ImageDimensions
-Microsoft::MSR::CNTK::PoolingNodeBase<float>::Validate
-Microsoft::MSR::CNTK::MaxPoolingNode<float>::Validate
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateNodes
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateSubNetwork
-Microsoft::MSR::CNTK::ComputationNetwork::CompileNetwork
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadNetworkFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::BuildNetworkFromDescription
-<lambda_129c9d8d27039b4c7cf30f939660c017>::operator()
-std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>
-std::_Func_impl<std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int> >,std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::_Do_call
-std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::operator()
-Microsoft::MSR::CNTK::SGD<float>::Train
-DoTrain<Microsoft::MSR::CNTK::ConfigParameters,float>
-DoCommands<float>
-wmainOldCNTKConfig
EXCEPTION occurred: Convolution operation currently only supports 1D or 2D convolution on 3D tensors.
-------------------------------------------------------------------
Usage: cntk configFile=yourConfigFile
For detailed information please consult the CNTK book
"An Introduction to Computational Networks and the Computational Network Toolkit"
-------------------------------------------------------------------
Ok, one more try: in Macros.ndl, in the ConvReLULayer macro, change b = ImageParameter... to the following:
b = ImageParameter(outMap, 1, 1, init = fixedValue, value = bValue, imageLayout = "legacy")
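Putting the two suggested edits together, the changed lines inside the ConvReLULayer macro would look roughly like this (a sketch only; the Convolution parameter names follow the CIFAR-10 sample's conventions and may differ slightly from the actual Macros.ndl):

```
b = ImageParameter(outMap, 1, 1, init = fixedValue, value = bValue, imageLayout = "legacy")
c = Convolution(W, inp, kW, kH, outMap, hStride, vStride, zeroPadding = true, imageLayout = "legacy")
```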
It still seems to be the same:
-------------------------------------------------------------------
Build info:
Built time: Feb 4 2016 16:45:41
Last modified date: Fri Jan 29 09:00:05 2016
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0
CUB_PATH: C:\NVIDIA\cub-1.4.1
CUDNN_PATH: C:\NVIDIA\cudnn-4.0\cuda
Build Branch:
Build SHA1:
Built by gmaerz on DE00-IN484-L
Build Path: C:\Users\gmaerz\OneDrive\CNTK\Source\CNTK\
-------------------------------------------------------------------
running on DE00-IN484-L at 2016/03/03 21:48:02
command line:
C:\Users\gmaerz\OneDrive\CNTK\x64\Debug\CNTK.exe configFile=01_Conv.config configName=01_Conv
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros=$ConfigDir$/Macros.ndl
precision=float
deviceId="-1"
prefetch=true
command=Train:Test
stderr=$OutputDir$/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=$ModelDir$/01_Convolution
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
Test=[
action=test
modelPath=$ModelDir$/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=$ConfigDir$/01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=$DataDir$/Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros=./Macros.ndl
precision=float
deviceId="-1"
prefetch=true
command=Train:Test
stderr=./Output/01_Conv
traceLevel=1
numMBsToShowResult=500
Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configName=01_Conv
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv.config:command=Train:Test
configparameters: 01_Conv.config:ConfigDir=.
configparameters: 01_Conv.config:configName=01_Conv
configparameters: 01_Conv.config:DataDir=.
configparameters: 01_Conv.config:deviceId=-1
configparameters: 01_Conv.config:ModelDir=./Output/Models
configparameters: 01_Conv.config:ndlMacros=./Macros.ndl
configparameters: 01_Conv.config:numMBsToShowResult=500
configparameters: 01_Conv.config:OutputDir=./Output
configparameters: 01_Conv.config:precision=float
configparameters: 01_Conv.config:prefetch=true
configparameters: 01_Conv.config:RootDir=.
configparameters: 01_Conv.config:stderr=./Output/01_Conv
configparameters: 01_Conv.config:Test=[
action=test
modelPath=./Output/Models/01_Convolution
minibatchSize=16
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
reader=[
readerType=UCIFastReader
file=./Test.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
configparameters: 01_Conv.config:traceLevel=1
configparameters: 01_Conv.config:Train=[
action=train
modelPath=./Output/Models/01_Convolution
NDLNetworkBuilder=[
networkDescription=./01_Convolution.ndl
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType=UCIFastReader
file=./Train.txt
randomize=None
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=./labelsmap.txt
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
command: Train Test
precision = float
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
CNTKCommandTrainBegin: Train
NDLBuilder Using CPU
Reading UCI file ./Train.txt
Post-processing network...
3 roots:
OutputNodes.z = Plus
CE = CrossEntropyWithSoftmax
Err = ErrorPrediction
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
Validating for node OutputNodes.z. 31 nodes to process in pass 1.
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64 {1,10}]
Validating --> h1.W = LearnableParameter -> [64 x 576 {1,64}]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800 {1,64}]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800 {1,32}]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75 {1,32}]
Validating --> features = InputValue -> [3 x 32 x 32 {1,3,96} x *]
Validating --> featOffs = LearnableParameter -> [1 x 1 {1,1}]
Validating --> featScaled = Minus(features[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}], featOffs[1 x 1 {1,1}]) -> [3 x 32 x 32 {1,3,96} x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75 {1,32}], featScaled[3 x 32 x 32 {1,3,96} x * {W=32, H=32, C=3}]) -> [32 x 32 x 32 {1,32,1024} x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 32 x 1 x 1 {1,1,32,32}]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 {1,32,1024} x * {W=32, H=32, C=32}], conv1_act.b[1 x 32 x 1 x 1 {1,1,32,32}]) -> [32 x 32 x 32 x 1 {1,32,1024,32768} x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x 1 {1,32,1024,32768} x *]) -> [32 x 32 x 32 x 1 {1,32,1024,32768} x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x 1 {1,32,1024,32768} x *])About to throw exception 'Convolution operation currently only supports 1D or 2D convolution on 3D tensors.'
[CALL STACK]
>Microsoft::MSR::CNTK::ThrowFormatted<std::invalid_argument>
-Microsoft::MSR::CNTK::InvalidArgument<>
-Microsoft::MSR::CNTK::ImageDimensions::ImageDimensions
-Microsoft::MSR::CNTK::PoolingNodeBase<float>::Validate
-Microsoft::MSR::CNTK::MaxPoolingNode<float>::Validate
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateNodes
-Microsoft::MSR::CNTK::ComputationNetwork::ValidateSubNetwork
-Microsoft::MSR::CNTK::ComputationNetwork::CompileNetwork
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::LoadNetworkFromConfig
-Microsoft::MSR::CNTK::NDLBuilder<float>::BuildNetworkFromDescription
-<lambda_129c9d8d27039b4c7cf30f939660c017>::operator()
-std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>
-std::_Func_impl<std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int> >,std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::_Do_call
-std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>::operator()
-Microsoft::MSR::CNTK::SGD<float>::Train
-DoTrain<Microsoft::MSR::CNTK::ConfigParameters,float>
-DoCommands<float>
-wmainOldCNTKConfig
EXCEPTION occurred: Convolution operation currently only supports 1D or 2D convolution on 3D tensors.
-------------------------------------------------------------------
Usage: cntk configFile=yourConfigFile
For detailed information please consult the CNTK book
"An Introduction to Computational Networks and the Computational Network Toolkit"
-------------------------------------------------------------------
Hmm, conv1_act.b does not look right, that's for sure. When I run this sample with the latest code and legacy imageLayout (no other changes) on CPU (deviceId=-1) I get this (which is expected):
conv1_act.b = LearnableParameter -> [32 x 1 x 1]
In your log it's:
conv1_act.b = LearnableParameter -> [1 x 32 x 1 x 1 {1,1,32,32}]
which is indeed a 4D tensor that is not compatible with our current convolution implementation. Is there any chance you can get the latest version from master, build, and try again (please undo the change to ImageParameter that I suggested earlier)?
Thanks and sorry about that.
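The call stack in the log (PoolingNodeBase::Validate -> ImageDimensions) shows where the check fires: the node rejects any sample tensor that is not rank 3. An illustrative Python sketch of that rank check (my own code, not CNTK's):

```python
def validate_pool_input(shape):
    """Illustrative only: pooling/convolution nodes in this CNTK version
    expect a 3D [W x H x C]-style sample tensor; any other rank triggers
    the error message seen in the logs above."""
    if len(shape) != 3:
        raise ValueError(
            "Convolution operation currently only supports 1D or 2D "
            "convolution on 3D tensors.")
    return shape

validate_pool_input((32, 32, 32))      # OK: rank-3 tensor, as expected
# validate_pool_input((1, 32, 1, 1))   # raises: rank-4 tensor, as in the log
```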
No need to be sorry, I'm glad for such a helpful community here :)
In order to rule out build failures, I've tried both rebuilding with the newest pull and running the newest binary GPU build (CNTK-2016-02-08-Windows-64bit-GPU).
After pulling the newest version and rebuilding it, I got a new exception:
EXCEPTION occurred: h1.t Times operation: Left [64 x 576 {1,64}] and right [3 x 3 x 64 {1,3,9}] operands' shapes are not compatible.
However, even more surprisingly, the new binary version seems to be working.
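Incidentally, the operand sizes in that new exception do line up after flattening (3 x 3 x 64 = 576 elements, equal to the left operand's 576 columns), which hints at a tensor-rank/flattening issue rather than a genuine size mismatch. A quick arithmetic check (my own snippet):

```python
from functools import reduce
from operator import mul

left = (64, 576)     # shape of h1.W from the exception message
right = (3, 3, 64)   # shape of the right operand from the exception message

# The right operand's total element count equals the left operand's column
# count, so the Times would be dimensionally valid if the rank-3 tensor
# were flattened into a single column.
print(reduce(mul, right) == left[1])  # True
```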
Log of my own rebuilt solution:
-------------------------------------------------------------------
Build info:
Built time: Mar 3 2016 22:55:07
Last modified date: Thu Mar 3 22:44:50 2016
Build type: Debug
Build target: GPU
With 1bit-SGD: no
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0
CUB_PATH: C:\NVIDIA\cub-1.4.1
CUDNN_PATH: C:\NVIDIA\cudnn-4.0\cuda
Build Branch:
Build SHA1: (modified)
Built by gmaerz on DE00-IN484-L
Build Path: C:\Users\gmaerz\OneDrive\CNTK\Source\CNTK\
-------------------------------------------------------------------
running on DE00-IN484-L at 2016/03/03 23:50:59
command line:
C:\Users\gmaerz\OneDrive\CNTK\x64\Debug\CNTK.exe configFile=01_Conv.cntk
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros="$ConfigDir$/Macros.ndl"
precision="float"
deviceId="auto"
prefetch="true"
command=Train:Test
modelPath="$ModelDir$/01_Convolution"
stderr="$OutputDir$/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="$ConfigDir$/01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="$DataDir$/Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="$DataDir$/labelsmap.txt"
]
]
]
Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="$DataDir$/Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="$DataDir$/labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros="./Macros.ndl"
precision="float"
deviceId="auto"
prefetch="true"
command=Train:Test
modelPath="./Output/Models/01_Convolution"
stderr="./Output/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv.cntk:command=Train:Test
configparameters: 01_Conv.cntk:ConfigDir=.
configparameters: 01_Conv.cntk:DataDir=.
configparameters: 01_Conv.cntk:deviceId=auto
configparameters: 01_Conv.cntk:ModelDir=./Output/Models
configparameters: 01_Conv.cntk:modelPath=./Output/Models/01_Convolution
configparameters: 01_Conv.cntk:ndlMacros=./Macros.ndl
configparameters: 01_Conv.cntk:numMBsToShowResult=500
configparameters: 01_Conv.cntk:OutputDir=./Output
configparameters: 01_Conv.cntk:precision=float
configparameters: 01_Conv.cntk:prefetch=true
configparameters: 01_Conv.cntk:RootDir=.
configparameters: 01_Conv.cntk:stderr=./Output/01_Conv
configparameters: 01_Conv.cntk:Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
configparameters: 01_Conv.cntk:traceLevel=1
configparameters: 01_Conv.cntk:Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
Commands: Train Test
Precision = "float"
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
##############################################################################
# #
# Action "train" #
# #
##############################################################################
CNTKCommandTrainBegin: Train
LockDevice: Locked GPU 0 to test availability.
LockDevice: Unlocked GPU 0 after testing.
LockDevice: Locked GPU 0 for exclusive use.
NDLBuilder Using GPU 0
Reading UCI file ./Train.txt
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::SetGaussianRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
Post-processing network...
3 roots:
CE = CrossEntropyWithSoftmax
Err = ErrorPrediction
OutputNodes.z = Plus
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
Validating network. 34 nodes to process in pass 1.
Validating --> labels = InputValue -> [10 {1} x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64 {1,10}]
Validating --> h1.W = LearnableParameter -> [64 x 576 {1,64}]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800 {1,64}]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800 {1,32}]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75 {1,32}]
Validating --> features = InputValue -> [32 x 32 x 3 {1,32,1024} x *]
Validating --> featOffs = LearnableParameter -> [1 x 1 {1,1}]
Validating --> featScaled = Minus(features[32 x 32 x 3 {1,32,1024} x * {W=32, H=3, C=32}], featOffs[1 x 1 {1,1}]) -> [32 x 32 x 3 {1,32,1024} x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75 {1,32}], featScaled[32 x 32 x 3 {1,32,1024} x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 {1,32,1024} x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32 {1,1,1}]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 {1,32,1024} x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32 {1,1,1}]) -> [32 x 32 x 32 {1,32,1024} x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 {1,32,1024} x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 {1,32,1024} x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 {1,32,1024} x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 {1,15,225} x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800 {1,32}], pool1[15 x 15 x 32 {1,15,225} x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 {1,15,225} x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32 {1,1,1}]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 {1,15,225} x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32 {1,1,1}]) -> [15 x 15 x 32 {1,15,225} x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 {1,15,225} x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 {1,15,225} x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 {1,15,225} x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 {1,7,49} x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800 {1,64}], pool2[7 x 7 x 32 {1,7,49} x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 {1,7,49} x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64 {1,1,1}]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 {1,7,49} x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64 {1,1,1}]) -> [7 x 7 x 64 {1,7,49} x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 {1,7,49} x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 {1,7,49} x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 {1,7,49} x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 {1,3,9} x *]
Validating --> h1.t = Times(h1.W[64 x 576 {1,64}], pool3[3 x 3 x 64 {1,3,9} x * {W=3, H=64, C=3}])
About to throw exception 'h1.t Times operation: Left [64 x 576 {1,64}] and right [3 x 3 x 64 {1,3,9}] operands' shapes are not compatible.'
EXCEPTION occurred: h1.t Times operation: Left [64 x 576 {1,64}] and right [3 x 3 x 64 {1,3,9}] operands' shapes are not compatible.
[CALL STACK]
> Microsoft::MSR::CNTK::TimesNodeBase<float,0>:: Validate
- Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNodes
- Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNetwork
- Microsoft::MSR::CNTK::ComputationNetwork:: CompileNetwork
- Microsoft::MSR::CNTK::NDLBuilder<float>:: LoadFromConfig
- Microsoft::MSR::CNTK::NDLBuilder<float>:: LoadNetworkFromConfig
- Microsoft::MSR::CNTK::NDLBuilder<float>:: BuildNetworkFromDescription
- <lambda_129c9d8d27039b4c7cf30f939660c017>:: operator ()
- std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>
- std::_Func_impl<std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>>,std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>:: _Do_call
- std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>:: operator ()
- Microsoft::MSR::CNTK::SGD<float>:: Train
- DoTrain<Microsoft::MSR::CNTK::ConfigParameters,float>
- DoCommands<float>
- wmainOldCNTKConfig
- wmain1
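For context, the failing node multiplies `h1.W [64 x 576]` by `pool3 [3 x 3 x 64]`. The element counts actually agree (3 × 3 × 64 = 576), so the error is purely about tensor shape: in the legacy layout this Debug build does not flatten the pooled tensor into a column vector before the `Times` validation. A minimal numpy sketch (not CNTK source code) of the same mismatch:

```python
# Hypothetical illustration (numpy, not CNTK): why [64 x 576] times
# [3 x 3 x 64] fails unless the pooled output is flattened first.
import numpy as np

W = np.zeros((64, 576))       # h1.W from the log: [64 x 576]
pool3 = np.zeros((3, 3, 64))  # pool3 output from the log: [3 x 3 x 64]

# 3 * 3 * 64 == 576, so the total element counts agree ...
assert pool3.size == W.shape[1]

# ... but a plain matrix product needs the right operand as a vector:
try:
    W @ pool3                 # (64, 576) x (3, 3, 64): shape mismatch
except ValueError as e:
    print("incompatible:", e)

h1_t = W @ pool3.reshape(576)  # flattening makes the product valid
print(h1_t.shape)              # (64,)
```

This matches the second log below, where the same `Times` node validates as `[64 x 576]` times `[3 x 3 x 64 x *] -> [64 x *]`, i.e. the release binary treats the tensor as a 576-element column per sample.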
For comparison, here is the log from the prebuilt binary release CNTK-2016-02-08-Windows-64bit-GPU, which runs past validation and trains:
-------------------------------------------------------------------
Build info:
Built time: Feb 8 2016 00:54:07
Last modified date: Sun Feb 7 16:51:01 2016
Build type: Unknown
Build target: Unknown
With 1bit-SGD: no
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0
CUB_PATH: C:\src\cub-1.4.1
CUDNN_PATH: C:\NVIDIA\cudnn-4.0\cuda
Build Branch: HEAD
Build SHA1: 2f9a48c71dc0a6097498cb7e90ac3b151ab536dd
Built by svcphil on LIANA-09-w
Build Path: c:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\
-------------------------------------------------------------------
running on DE00-IN484-L at 2016/03/03 22:58:50
command line:
C:\temp\cntk\cntk\CNTK.exe configFile=01_Conv.cntk
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros="$ConfigDir$/Macros.ndl"
precision="float"
deviceId="auto"
prefetch="true"
command=Train:Test
modelPath="$ModelDir$/01_Convolution"
stderr="$OutputDir$/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="$ConfigDir$/01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="$DataDir$/Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="$DataDir$/labelsmap.txt"
]
]
]
Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="$DataDir$/Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="$DataDir$/labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros="./Macros.ndl"
precision="float"
deviceId="auto"
prefetch="true"
command=Train:Test
modelPath="./Output/Models/01_Convolution"
stderr="./Output/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv.cntk:command=Train:Test
configparameters: 01_Conv.cntk:ConfigDir=.
configparameters: 01_Conv.cntk:DataDir=.
configparameters: 01_Conv.cntk:deviceId=auto
configparameters: 01_Conv.cntk:ModelDir=./Output/Models
configparameters: 01_Conv.cntk:modelPath=./Output/Models/01_Convolution
configparameters: 01_Conv.cntk:ndlMacros=./Macros.ndl
configparameters: 01_Conv.cntk:numMBsToShowResult=500
configparameters: 01_Conv.cntk:OutputDir=./Output
configparameters: 01_Conv.cntk:precision=float
configparameters: 01_Conv.cntk:prefetch=true
configparameters: 01_Conv.cntk:RootDir=.
configparameters: 01_Conv.cntk:stderr=./Output/01_Conv
configparameters: 01_Conv.cntk:Test=[
action="test"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
randomize="none"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
configparameters: 01_Conv.cntk:traceLevel=1
configparameters: 01_Conv.cntk:Train=[
action="train"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
randomize="auto"
features=[
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile="./labelsmap.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
command: Train Test
precision = float
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
CNTKCommandTrainBegin: Train
LockDevice: Locked GPU 0 to test availability.
LockDevice: Unlocked GPU 0 after testing.
LockDevice: Locked GPU 0 for exclusive use.
NDLBuilder Using GPU 0
Reading UCI file ./Train.txt
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::SetGaussianRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
Post-processing network...
3 roots:
CE = CrossEntropyWithSoftmax
OutputNodes.z = Plus
Err = ErrorPrediction
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
Validating network. 34 nodes to process in pass 1.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating network. 21 nodes to process in pass 2.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating network, final pass.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
13 out of 34 nodes do not share the minibatch layout with the input data.
Post-processing network complete.
SGD using GPU 0.
Training criterion node(s):
CE = CrossEntropyWithSoftmax
Evaluation criterion node(s):
Err = ErrorPrediction
Allocating matrices for forward and/or backward propagation.
No PreCompute nodes found, skipping PreCompute step
Set Max Temp Mem Size For Convolution Nodes to 0 samples.
Starting Epoch 1: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
UCIFastReader: Starting at epoch 0, counting lines to determine record count...
50000 records found.
starting epoch 0 at record count 0, and file position 0
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9955 retries for 49984 elements (19.9%) to ensure window condition
RandomOrdering: recached sequence for seed 0: 3894, 8746, ...
Epoch[ 1 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 1.67275720; EvalErr[0]PerSample = 0.60853125; TotalTime = 60.7581s; SamplesPerSecond = 526.7
Finished Epoch[ 1 of 30]: [Training Set] TrainLossPerSample = 1.5577571; EvalErrPerSample = 0.56322026; AvgLearningRatePerSample = 0.00015625; EpochTime=122.9
Starting Epoch 2: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 1 at record count 49984, and file position 49984
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9883 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 1: 12346, 5813, ...
Epoch[ 2 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 31952; TrainLossPerSample = 1.18145064; EvalErr[0]PerSample = 0.41609289; TotalTime = 54.7537s; SamplesPerSecond = 583.6
Finished Epoch[ 2 of 30]: [Training Set] TrainLossPerSample = 1.1366092; EvalErrPerSample = 0.39934781; AvgLearningRatePerSample = 0.00015625; EpochTime=85.6537
Starting Epoch 3: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 2 at record count 99968, and file position 49968
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9820 retries for 49984 elements (19.6%) to ensure window condition
RandomOrdering: recached sequence for seed 2: 21179, 10736, ...
Epoch[ 3 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 31968; TrainLossPerSample = 0.99013778; EvalErr[0]PerSample = 0.34700325; TotalTime = 54.7826s; SamplesPerSecond = 583.5
Finished Epoch[ 3 of 30]: [Training Set] TrainLossPerSample = 0.97390151; EvalErrPerSample = 0.33972871; AvgLearningRatePerSample = 0.00015625; EpochTime=85.6577
Starting Epoch 4: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 3 at record count 149952, and file position 49952
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9821 retries for 49984 elements (19.6%) to ensure window condition
RandomOrdering: recached sequence for seed 3: 5516, 5884, ...
Epoch[ 4 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 31984; TrainLossPerSample = 0.88054947; EvalErr[0]PerSample = 0.30602801; TotalTime = 55.1802s; SamplesPerSecond = 579.6
Finished Epoch[ 4 of 30]: [Training Set] TrainLossPerSample = 0.87399435; EvalErrPerSample = 0.30381721; AvgLearningRatePerSample = 0.00015625; EpochTime=86.5706
Starting Epoch 5: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 4 at record count 199936, and file position 49936
already there from last epoch
Starting minibatch loop.
RandomOrdering: 10109 retries for 49984 elements (20.2%) to ensure window condition
RandomOrdering: recached sequence for seed 4: 2602, 18581, ...
Epoch[ 5 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.82331195; EvalErr[0]PerSample = 0.28393750; TotalTime = 55.8048s; SamplesPerSecond = 573.4
Finished Epoch[ 5 of 30]: [Training Set] TrainLossPerSample = 0.81178772; EvalErrPerSample = 0.28014964; AvgLearningRatePerSample = 0.00015625; EpochTime=87.1605
Switching dropout rate to 0.5.
Starting Epoch 6: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 5 at record count 249920, and file position 49920
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9976 retries for 49984 elements (20.0%) to ensure window condition
RandomOrdering: recached sequence for seed 5: 23679, 21503, ...
Epoch[ 6 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 1.07542212; EvalErr[0]PerSample = 0.36756250; TotalTime = 56.0877s; SamplesPerSecond = 570.5
Finished Epoch[ 6 of 30]: [Training Set] TrainLossPerSample = 1.0569457; EvalErrPerSample = 0.36003521; AvgLearningRatePerSample = 0.00015625; EpochTime=88.7295
Starting Epoch 7: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 6 at record count 299904, and file position 49904
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9965 retries for 49984 elements (19.9%) to ensure window condition
RandomOrdering: recached sequence for seed 6: 14851, 3424, ...
Epoch[ 7 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.97511536; EvalErr[0]PerSample = 0.32893750; TotalTime = 60.0215s; SamplesPerSecond = 533.1
Finished Epoch[ 7 of 30]: [Training Set] TrainLossPerSample = 0.97243714; EvalErrPerSample = 0.32660452; AvgLearningRatePerSample = 0.00015625; EpochTime=93.8356
Starting Epoch 8: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 7 at record count 349888, and file position 49888
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9867 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 7: 16849, 2766, ...
Epoch[ 8 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.93696350; EvalErr[0]PerSample = 0.31400000; TotalTime = 60.1865s; SamplesPerSecond = 531.7
Finished Epoch[ 8 of 30]: [Training Set] TrainLossPerSample = 0.92582047; EvalErrPerSample = 0.31025928; AvgLearningRatePerSample = 0.00015625; EpochTime=94.0078
Starting Epoch 9: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 8 at record count 399872, and file position 49872
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9935 retries for 49984 elements (19.9%) to ensure window condition
RandomOrdering: recached sequence for seed 8: 938, 11770, ...
Epoch[ 9 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.88571924; EvalErr[0]PerSample = 0.29534375; TotalTime = 61.1568s; SamplesPerSecond = 523.2
Finished Epoch[ 9 of 30]: [Training Set] TrainLossPerSample = 0.89104927; EvalErrPerSample = 0.29819542; AvgLearningRatePerSample = 0.00015625; EpochTime=95.7176
Starting Epoch 10: learning rate per sample = 0.000156 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 9 at record count 449856, and file position 49856
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9888 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 9: 10600, 4426, ...
Epoch[10 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.85505237; EvalErr[0]PerSample = 0.28478125; TotalTime = 60.9339s; SamplesPerSecond = 525.2
Finished Epoch[10 of 30]: [Training Set] TrainLossPerSample = 0.8553952; EvalErrPerSample = 0.28411093; AvgLearningRatePerSample = 0.00015625; EpochTime=95.5035
Starting Epoch 11: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 10 at record count 499840, and file position 49840
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9883 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 10: 19100, 7348, ...
Epoch[11 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.72444861; EvalErr[0]PerSample = 0.23906250; TotalTime = 62.3162s; SamplesPerSecond = 513.5
Finished Epoch[11 of 30]: [Training Set] TrainLossPerSample = 0.71067673; EvalErrPerSample = 0.23363477; AvgLearningRatePerSample = 4.6875e-005; EpochTime=97.4373
Starting Epoch 12: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 11 at record count 549824, and file position 49824
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9832 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 11: 3190, 5055, ...
Epoch[12 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.68085730; EvalErr[0]PerSample = 0.22637500; TotalTime = 61.1765s; SamplesPerSecond = 523.1
Finished Epoch[12 of 30]: [Training Set] TrainLossPerSample = 0.67801023; EvalErrPerSample = 0.22483195; AvgLearningRatePerSample = 4.6875e-005; EpochTime=95.6381
Starting Epoch 13: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 12 at record count 599808, and file position 49808
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9611 retries for 49984 elements (19.2%) to ensure window condition
RandomOrdering: recached sequence for seed 12: 2246, 15193, ...
Epoch[13 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.66249323; EvalErr[0]PerSample = 0.21806250; TotalTime = 60.3991s; SamplesPerSecond = 529.8
Finished Epoch[13 of 30]: [Training Set] TrainLossPerSample = 0.66005397; EvalErrPerSample = 0.21678938; AvgLearningRatePerSample = 4.6875e-005; EpochTime=94.8426
Starting Epoch 14: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 13 at record count 649792, and file position 49792
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9916 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 13: 21351, 4729, ...
Epoch[14 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.63945258; EvalErr[0]PerSample = 0.21068750; TotalTime = 63.1367s; SamplesPerSecond = 506.8
Finished Epoch[14 of 30]: [Training Set] TrainLossPerSample = 0.64283323; EvalErrPerSample = 0.21250801; AvgLearningRatePerSample = 4.6875e-005; EpochTime=99.9905
Starting Epoch 15: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 14 at record count 699776, and file position 49776
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9806 retries for 49984 elements (19.6%) to ensure window condition
RandomOrdering: recached sequence for seed 14: 5441, 4486, ...
Epoch[15 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.63832739; EvalErr[0]PerSample = 0.20843750; TotalTime = 65.5967s; SamplesPerSecond = 487.8
Finished Epoch[15 of 30]: [Training Set] TrainLossPerSample = 0.63823402; EvalErrPerSample = 0.20898688; AvgLearningRatePerSample = 4.6875e-005; EpochTime=102.441
Starting Epoch 16: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 15 at record count 749760, and file position 49760
already there from last epoch
Starting minibatch loop.
RandomOrdering: 10099 retries for 49984 elements (20.2%) to ensure window condition
RandomOrdering: recached sequence for seed 15: 8243, 11419, ...
Epoch[16 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.63027692; EvalErr[0]PerSample = 0.20584375; TotalTime = 61.3233s; SamplesPerSecond = 521.8
Finished Epoch[16 of 30]: [Training Set] TrainLossPerSample = 0.62585586; EvalErrPerSample = 0.20480554; AvgLearningRatePerSample = 4.6875e-005; EpochTime=95.7758
Starting Epoch 17: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 16 at record count 799744, and file position 49744
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9898 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 16: 10855, 1361, ...
Epoch[17 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.61816498; EvalErr[0]PerSample = 0.20075000; TotalTime = 60.3761s; SamplesPerSecond = 530.0
Finished Epoch[17 of 30]: [Training Set] TrainLossPerSample = 0.61884153; EvalErrPerSample = 0.20176457; AvgLearningRatePerSample = 4.6875e-005; EpochTime=94.3146
Starting Epoch 18: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 17 at record count 849728, and file position 49728
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9731 retries for 49984 elements (19.5%) to ensure window condition
RandomOrdering: recached sequence for seed 17: 8, 10814, ...
Epoch[18 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.61204218; EvalErr[0]PerSample = 0.20112500; TotalTime = 61.1974s; SamplesPerSecond = 522.9
Finished Epoch[18 of 30]: [Training Set] TrainLossPerSample = 0.61181009; EvalErrPerSample = 0.20048416; AvgLearningRatePerSample = 4.6875e-005; EpochTime=95.6448
Starting Epoch 19: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 18 at record count 899712, and file position 49712
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9824 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 18: 8997, 5961, ...
Epoch[19 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.60144781; EvalErr[0]PerSample = 0.19650000; TotalTime = 60.3804s; SamplesPerSecond = 530.0
Finished Epoch[19 of 30]: [Training Set] TrainLossPerSample = 0.60449916; EvalErrPerSample = 0.19830346; AvgLearningRatePerSample = 4.6875e-005; EpochTime=94.2939
Starting Epoch 20: learning rate per sample = 0.000047 effective momentum = 0.900000 momentum as time constant = 607.4 samples
starting epoch 19 at record count 949696, and file position 49696
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9760 retries for 49984 elements (19.5%) to ensure window condition
RandomOrdering: recached sequence for seed 19: 4920, 1109, ...
Epoch[20 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.59251532; EvalErr[0]PerSample = 0.19287500; TotalTime = 61.2158s; SamplesPerSecond = 522.7
Finished Epoch[20 of 30]: [Training Set] TrainLossPerSample = 0.59491181; EvalErrPerSample = 0.19410211; AvgLearningRatePerSample = 4.6875e-005; EpochTime=95.6624
Starting Epoch 21: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 20 at record count 999680, and file position 49680
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9833 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 20: 24822, 13807, ...
Epoch[21 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.76228503; EvalErr[0]PerSample = 0.25471875; TotalTime = 60.4455s; SamplesPerSecond = 529.4
Finished Epoch[21 of 30]: [Training Set] TrainLossPerSample = 0.69931042; EvalErrPerSample = 0.23153409; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=94.8921
Starting Epoch 22: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 21 at record count 1049664, and file position 49664
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9782 retries for 49984 elements (19.6%) to ensure window condition
RandomOrdering: recached sequence for seed 21: 11249, 13793, ...
Epoch[22 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.55517798; EvalErr[0]PerSample = 0.18190625; TotalTime = 60.3673s; SamplesPerSecond = 530.1
Finished Epoch[22 of 30]: [Training Set] TrainLossPerSample = 0.54519671; EvalErrPerSample = 0.17827705; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=94.3621
Starting Epoch 23: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 22 at record count 1099648, and file position 49648
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9734 retries for 49984 elements (19.5%) to ensure window condition
RandomOrdering: recached sequence for seed 22: 20330, 17965, ...
Epoch[23 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.51714801; EvalErr[0]PerSample = 0.17000000; TotalTime = 61.2311s; SamplesPerSecond = 522.6
Finished Epoch[23 of 30]: [Training Set] TrainLossPerSample = 0.51686686; EvalErrPerSample = 0.16821383; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=95.6817
Starting Epoch 24: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 23 at record count 1149632, and file position 49632
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9778 retries for 49984 elements (19.6%) to ensure window condition
RandomOrdering: recached sequence for seed 23: 16023, 4803, ...
Epoch[24 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.51147125; EvalErr[0]PerSample = 0.16443750; TotalTime = 61.6792s; SamplesPerSecond = 518.8
Finished Epoch[24 of 30]: [Training Set] TrainLossPerSample = 0.50843734; EvalErrPerSample = 0.16317222; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=97.1277
Starting Epoch 25: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 24 at record count 1199616, and file position 49616
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9845 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 24: 5725, 4504, ...
Epoch[25 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.50563559; EvalErr[0]PerSample = 0.16443750; TotalTime = 63.0981s; SamplesPerSecond = 507.1
Finished Epoch[25 of 30]: [Training Set] TrainLossPerSample = 0.50288838; EvalErrPerSample = 0.16245198; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=99.6339
Starting Epoch 26: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 25 at record count 1249600, and file position 49600
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9907 retries for 49984 elements (19.8%) to ensure window condition
RandomOrdering: recached sequence for seed 25: 142, 14350, ...
Epoch[26 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.49227759; EvalErr[0]PerSample = 0.15862500; TotalTime = 69.3028s; SamplesPerSecond = 461.7
Finished Epoch[26 of 30]: [Training Set] TrainLossPerSample = 0.49257073; EvalErrPerSample = 0.15859075; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=109.706
Starting Epoch 27: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 26 at record count 1299584, and file position 49584
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9854 retries for 49984 elements (19.7%) to ensure window condition
RandomOrdering: recached sequence for seed 26: 6671, 1505, ...
Epoch[27 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.49133130; EvalErr[0]PerSample = 0.15878125; TotalTime = 71.8662s; SamplesPerSecond = 445.3
Finished Epoch[27 of 30]: [Training Set] TrainLossPerSample = 0.48960817; EvalErrPerSample = 0.15805058; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=112.249
Starting Epoch 28: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 27 at record count 1349568, and file position 49568
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9981 retries for 49984 elements (20.0%) to ensure window condition
RandomOrdering: recached sequence for seed 27: 10220, 7497, ...
Epoch[28 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.48690274; EvalErr[0]PerSample = 0.15915625; TotalTime = 68.8961s; SamplesPerSecond = 464.5
Finished Epoch[28 of 30]: [Training Set] TrainLossPerSample = 0.48738438; EvalErrPerSample = 0.15803057; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=107.621
Starting Epoch 29: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 28 at record count 1399552, and file position 49552
already there from last epoch
Starting minibatch loop.
RandomOrdering: 10093 retries for 49984 elements (20.2%) to ensure window condition
RandomOrdering: recached sequence for seed 28: 7794, 20194, ...
Epoch[29 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.48750171; EvalErr[0]PerSample = 0.15856250; TotalTime = 66.5326s; SamplesPerSecond = 481.0
Finished Epoch[29 of 30]: [Training Set] TrainLossPerSample = 0.48358461; EvalErrPerSample = 0.15596992; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=105.242
Starting Epoch 30: learning rate per sample = 0.000016 effective momentum = 0.990000 momentum as time constant = 6368.0 samples
starting epoch 29 at record count 1449536, and file position 49536
already there from last epoch
Starting minibatch loop.
RandomOrdering: 9952 retries for 49984 elements (19.9%) to ensure window condition
RandomOrdering: recached sequence for seed 29: 11982, 15342, ...
Epoch[30 of 30]-Minibatch[ 1- 500, 64.02%]: SamplesSeen = 32000; TrainLossPerSample = 0.49641956; EvalErr[0]PerSample = 0.16156250; TotalTime = 69.7728s; SamplesPerSecond = 458.6
Finished Epoch[30 of 30]: [Training Set] TrainLossPerSample = 0.49062848; EvalErrPerSample = 0.15925096; AvgLearningRatePerSample = 1.5625001e-005; EpochTime=110.162
CNTKCommandTrainEnd: Train
Reading UCI file ./Test.txt
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
WARNING: trying to use cuDNN on unsupported platform. It is safe to ignore the warning if it's produced during model editing command.
Post-processing network...
3 roots:
Err = ErrorPrediction
CE = CrossEntropyWithSoftmax
OutputNodes.z = Plus
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
Validating network. 34 nodes to process in pass 1.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating network. 21 nodes to process in pass 2.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating network, final pass.
Validating --> labels = InputValue -> [10 x *]
Validating --> OutputNodes.W = LearnableParameter -> [10 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 576]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 576], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}]) -> [64 x *]
Validating --> h1.b = LearnableParameter -> [64 x 1]
Validating --> h1.z = Plus(h1.t[64 x *], h1.b[64 x 1]) -> [64 x 1 x *]
Validating --> h1.y = RectifiedLinear(h1.z[64 x 1 x *]) -> [64 x 1 x *]
Validating --> h1_d = Dropout(h1.y[64 x 1 x *]) -> [64 x 1 x *]
Validating --> OutputNodes.t = Times(OutputNodes.W[10 x 64], h1_d[64 x 1 x *]) -> [10 x *]
Validating --> OutputNodes.b = LearnableParameter -> [10]
Validating --> OutputNodes.z = Plus(OutputNodes.t[10 x *], OutputNodes.b[10]) -> [10 x *]
Validating --> Err = ErrorPrediction(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
Validating --> CE = CrossEntropyWithSoftmax(labels[10 x *], OutputNodes.z[10 x *]) -> [1]
13 out of 34 nodes do not share the minibatch layout with the input data.
Post-processing network complete.
evalNodeNames are not specified, using all the default evalnodes and training criterion nodes.
Allocating matrices for forward and/or backward propagation.
UCIFastReader: Starting at epoch 0, counting lines to determine record count...
10000 records found.
starting epoch 0 at record count 0, and file position 0
already there from last epoch
Minibatch[1-500]: Samples Seen = 8000 Err: ErrorPrediction/Sample = 0.201125 CE: CrossEntropyWithSoftmax/Sample = 0.63075029
Minibatch[501-625]: Samples Seen = 2000 Err: ErrorPrediction/Sample = 0.2085 CE: CrossEntropyWithSoftmax/Sample = 0.60663918
Final Results: Minibatch[1-625]: Samples Seen = 10000 Err: ErrorPrediction/Sample = 0.2026 CE: CrossEntropyWithSoftmax/Sample = 0.62592807 Perplexity = 1.8699806
COMPLETED
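As a sanity check on the final-results line above: the Perplexity that CNTK reports is just the exponential of the per-sample cross-entropy. A minimal sketch (the number is copied from the log):

```python
import math

# CE per sample from the "Final Results" line of the log
ce_per_sample = 0.62592807

perplexity = math.exp(ce_per_sample)
print(perplexity)  # ~1.8699806, matching the Perplexity reported in the log
```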
Ok, good, we are making some progress :) I know that error, it's due to one of the recent changes. We have not updated all of our samples (image in particular) to reflect that change. What you need to do is this:
Add the following to Macros.ndl:
DNNImageReLULayer(inW, inH, inC, outDim, x, wScale, bValue) = [
W = ImageParameter(outDim, inW, inH, inC, init = Gaussian, initValueScale = wScale, imageLayout="legacy")
b = LearnableParameter(outDim, 1, init = fixedValue, value = bValue)
t = Times(W, x)
z = Plus(t, b)
y = RectifiedLinear(z)
]
And in 01_Convolution.ndl
replace the line that uses DNNReLULayer with this:
h1 = DNNImageReLULayer(3, 3, cMap3, hiddenDim, pool3, fc1WScale, fc1BValue)
Then it should work fine.
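The shape mismatch this macro works around can be sketched with NumPy (a simplified illustration, not CNTK's actual implementation; shapes are taken from the log, and C-order flattening is assumed): pool3 produces a [3 x 3 x 64] tensor, while a plain LearnableParameter stores the fully-connected weight as a flat [64 x 576] matrix. Declaring W with explicit image dimensions, as ImageParameter does, keeps the per-axis layout so the Times node can match it against the tensor-shaped input:

```python
import numpy as np

# Shapes from the log: pool3 output is [3 x 3 x 64], hidden layer has 64 units.
pool3 = np.random.rand(3, 3, 64).astype(np.float32)

# A plain LearnableParameter only knows the flat dimension 3*3*64 = 576,
# so the product works only if the input is flattened the same way:
W_flat = np.random.rand(64, 576).astype(np.float32)
h = W_flat @ pool3.reshape(576)  # (64, 576) @ (576,) -> (64,)

# With explicit image dimensions the weight keeps its per-axis structure,
# and the contraction over (3, 3, 64) gives the same result:
W_img = W_flat.reshape(64, 3, 3, 64)
h2 = np.tensordot(W_img, pool3, axes=([1, 2, 3], [0, 1, 2]))

assert np.allclose(h, h2, rtol=1e-4, atol=1e-3)
print(h.shape)  # (64,)
```

This is why the original network fails with "Left [64 x 576] and right [3 x 3 x 64] operands' shapes are not compatible": the flat weight and the tensor input no longer agree on how the 576 elements are laid out.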
Hi Alexey
I ran into the same issue. Fortunately, thanks to your advice, I was able to get rid of the following error message:
EXCEPTION occurred: h1.t Times operation: Left [64 x 576 {1,64}] and right [3 x 3 x 64 {1,3,9}] operands' shapes are not compatible.
However, I now get the following error message after adding DNNImageReLULayer to *.ndl.
-------------------------------------------------------------------
Build info:
Built time: Mar 5 2016 01:31:44
Last modified date: Fri Mar 4 23:52:36 2016
Build type: Release
Build target: GPU
With 1bit-SGD: no
CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
CUB_PATH: D:\Works\Lib\NVIDIA\cub-1.4.1
CUDNN_PATH: D:\Works\Lib\NVIDIA\cudnn-4.0\cuda
Build Branch: master
Build SHA1: 79f5349363b4d6e3f6cd8f4c90af6ba290c0b270 (modified)
Built by TAKUYA on Takuya-PC
Build Path: D:\Works\Lib\Microsoft\CNTK\CNTK\Source\CNTK\
-------------------------------------------------------------------
Redirecting stderr to file ./Output/01_Conv_Train_Test.log
running on Takuya-PC at 2016/03/05 01:39:39
command line:
cntk configFile=01_Conv2.cntk deviceId=0
>>>>>>>>>>>>>>>>>>>> RAW CONFIG (VARIABLES NOT RESOLVED) >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "$RootDir$"
DataDir = "$RootDir$"
OutputDir = "$RootDir$/Output"
ModelDir = "$OutputDir$/Models"
ndlMacros="$ConfigDir$/Macros.ndl"
precision="float"
deviceId=0
prefetch=true
command=Train:Test
stderr="$OutputDir$/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
modelPath="$ModelDir$/01_Convolution"
NDLNetworkBuilder=[
networkDescription="$ConfigDir$/01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="$DataDir$/Train.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="$DataDir$/fine_label_names.txt"
]
]
]
Test=[
action="test"
modelPath="$ModelDir$/01_Convolution"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="$DataDir$/Test.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="$DataDir$/fine_label_names.txt"
]
]
]
deviceId=0
<<<<<<<<<<<<<<<<<<<< RAW CONFIG (VARIABLES NOT RESOLVED) <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> RAW CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
RootDir = "."
ConfigDir = "."
DataDir = "."
OutputDir = "./Output"
ModelDir = "./Output/Models"
ndlMacros="./Macros.ndl"
precision="float"
deviceId=0
prefetch=true
command=Train:Test
stderr="./Output/01_Conv"
traceLevel=1
numMBsToShowResult=500
Train=[
action="train"
modelPath="./Output/Models/01_Convolution"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="./fine_label_names.txt"
]
]
]
Test=[
action="test"
modelPath="./Output/Models/01_Convolution"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="./fine_label_names.txt"
]
]
]
deviceId=0
<<<<<<<<<<<<<<<<<<<< RAW CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>> PROCESSED CONFIG WITH ALL VARIABLES RESOLVED >>>>>>>>>>>>>>>>>>>>
configparameters: 01_Conv2.cntk:command=Train:Test
configparameters: 01_Conv2.cntk:ConfigDir=.
configparameters: 01_Conv2.cntk:DataDir=.
configparameters: 01_Conv2.cntk:deviceId=0
configparameters: 01_Conv2.cntk:ModelDir=./Output/Models
configparameters: 01_Conv2.cntk:ndlMacros=./Macros.ndl
configparameters: 01_Conv2.cntk:numMBsToShowResult=500
configparameters: 01_Conv2.cntk:OutputDir=./Output
configparameters: 01_Conv2.cntk:precision=float
configparameters: 01_Conv2.cntk:prefetch=true
configparameters: 01_Conv2.cntk:RootDir=.
configparameters: 01_Conv2.cntk:stderr=./Output/01_Conv
configparameters: 01_Conv2.cntk:Test=[
action="test"
modelPath="./Output/Models/01_Convolution"
minibatchSize=16
reader=[
readerType="UCIFastReader"
file="./Test.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="./fine_label_names.txt"
]
]
]
configparameters: 01_Conv2.cntk:traceLevel=1
configparameters: 01_Conv2.cntk:Train=[
action="train"
modelPath="./Output/Models/01_Convolution"
NDLNetworkBuilder=[
networkDescription="./01_Convolution.ndl"
]
SGD=[
epochSize=49984
minibatchSize=64
learningRatesPerMB=0.01*10:0.003*10:0.001
momentumPerMB=0.9*20:0.99
maxEpochs=30
L2RegWeight=0.03
dropoutRate=0*5:0.5
]
reader=[
readerType="UCIFastReader"
file="./Train.txt"
features=[
dim=3072
start=2
]
labels=[
dim=1
start=1
labelDim=100
labelMappingFile="./fine_label_names.txt"
]
]
]
<<<<<<<<<<<<<<<<<<<< PROCESSED CONFIG WITH ALL VARIABLES RESOLVED <<<<<<<<<<<<<<<<<<<<
Commands: Train Test
Precision = "float"
CNTKModelPath: ./Output/Models/01_Convolution
CNTKCommandTrainInfo: Train : 30
CNTKCommandTrainInfo: CNTKNoMoreCommands_Total : 30
##############################################################################
# #
# Action "train" #
# #
##############################################################################
CNTKCommandTrainBegin: Train
NDLBuilder Using GPU 0
Reading UCI file ./Train.txt
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::SetGaussianRandomValue (GPU): creating curand object with seed 1, sizeof(ElemType)==4
Post-processing network...
3 roots:
CE = CrossEntropyWithSoftmax
Err = ErrorPrediction
OutputNodes.z = Plus
FormNestedNetwork: WARNING: Was called twice for CE CrossEntropyWithSoftmax operation
FormNestedNetwork: WARNING: Was called twice for Err ErrorPrediction operation
FormNestedNetwork: WARNING: Was called twice for OutputNodes.z Plus operation
Validating network. 34 nodes to process in pass 1.
Validating --> labels = InputValue -> [100 x *]
Validating --> OutputNodes.W = LearnableParameter -> [100 x 64]
Validating --> h1.W = LearnableParameter -> [64 x 64 x 3 x 3]
Validating --> conv3_act.W = LearnableParameter -> [64 x 800]
Validating --> conv2_act.W = LearnableParameter -> [32 x 800]
Validating --> conv1_act.W = LearnableParameter -> [32 x 75]
Validating --> features = InputValue -> [32 x 32 x 3 x *]
Validating --> featOffs = LearnableParameter -> [1 x 1]
Validating --> featScaled = Minus(features[32 x 32 x 3 x * {W=32, H=3, C=32}], featOffs[1 x 1]) -> [32 x 32 x 3 x *]
Validating --> conv1_act.c = Convolution(conv1_act.W[32 x 75], featScaled[32 x 32 x 3 x * {W=32, H=3, C=32}]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv1_act.p = Plus(conv1_act.c[32 x 32 x 32 x * {W=32, H=32, C=32}], conv1_act.b[1 x 1 x 32]) -> [32 x 32 x 32 x *]
Validating --> conv1_act.y = RectifiedLinear(conv1_act.p[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [32 x 32 x 32 x *]
Validating --> pool1 = MaxPooling(conv1_act.y[32 x 32 x 32 x * {W=32, H=32, C=32}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.c = Convolution(conv2_act.W[32 x 800], pool1[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.b = LearnableParameter -> [1 x 1 x 32]
Validating --> conv2_act.p = Plus(conv2_act.c[15 x 15 x 32 x * {W=15, H=32, C=15}], conv2_act.b[1 x 1 x 32]) -> [15 x 15 x 32 x *]
Validating --> conv2_act.y = RectifiedLinear(conv2_act.p[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [15 x 15 x 32 x *]
Validating --> pool2 = MaxPooling(conv2_act.y[15 x 15 x 32 x * {W=15, H=32, C=15}]) -> [7 x 7 x 32 x *]
Validating --> conv3_act.c = Convolution(conv3_act.W[64 x 800], pool2[7 x 7 x 32 x * {W=7, H=32, C=7}]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.b = LearnableParameter -> [1 x 1 x 64]
Validating --> conv3_act.p = Plus(conv3_act.c[7 x 7 x 64 x * {W=7, H=64, C=7}], conv3_act.b[1 x 1 x 64]) -> [7 x 7 x 64 x *]
Validating --> conv3_act.y = RectifiedLinear(conv3_act.p[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [7 x 7 x 64 x *]
Validating --> pool3 = MaxPooling(conv3_act.y[7 x 7 x 64 x * {W=7, H=64, C=7}]) -> [3 x 3 x 64 x *]
Validating --> h1.t = Times(h1.W[64 x 64 x 3 x 3], pool3[3 x 3 x 64 x * {W=3, H=64, C=3}])
EXCEPTION occurred: h1.t Times operation: Left [64 x 64 x 3 x 3] and right [3 x 3 x 64] operands' shapes are not compatible.
[CALL STACK]
> Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNodes
- Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNetwork
- Microsoft::MSR::CNTK::ComputationNetwork:: CompileNetwork
- Microsoft::MSR::CNTK::NDLBuilder<float>:: LoadFromConfig
- Microsoft::MSR::CNTK::NDLBuilder<float>:: LoadNetworkFromConfig
- Microsoft::MSR::CNTK::NDLBuilder<float>:: BuildNetworkFromDescription
- <lambda_129c9d8d27039b4c7cf30f939660c017>:: operator ()
- std::_Callable_obj<<lambda_129c9d8d27039b4c7cf30f939660c017>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>
- std::_Func_impl<std::_Callable_obj<<lambda_ba19cc5edeff603974c9d97d3d5c52ff>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>>,std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>:: _Do_call
- std::_Func_class<std::shared_ptr<Microsoft::MSR::CNTK::ComputationNetwork>,int>:: operator ()
- Microsoft::MSR::CNTK::SGD<float>:: Train
- DoTrain<Microsoft::MSR::CNTK::ConfigParameters,float>
- DoCommands<float>
- wmainOldCNTKConfig
- wmain1
- wmain
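(For readers hitting the same `Times` shape error: the left operand was built expecting CHW ordering, while the legacy/HWC data layout delivers the same 576 values in a different element order. A toy numpy illustration, not CNTK code; the shapes are taken from the validation trace above:)

```python
import numpy as np

np.random.seed(0)

# Weight for a Times node expecting its input flattened in CHW order:
# 64 output units, 64 channels * 3 * 3 spatial = 576 inputs.
W = np.random.randn(64, 64 * 3 * 3)

# Legacy (HWC) layout: height x width x channels
pool3_hwc = np.random.randn(3, 3, 64)
# cudnn (CHW) layout: channels x height x width
pool3_chw = pool3_hwc.transpose(2, 0, 1)

# Both tensors hold the same 576 values...
assert pool3_hwc.size == pool3_chw.size == 576

# ...but flattening them yields different element orders, so a weight
# matrix built for one layout produces wrong results on the other.
out_hwc = W @ pool3_hwc.ravel()
out_chw = W @ pool3_chw.ravel()
assert not np.allclose(out_hwc, out_chw)
```

This is why the layout declared in the .ndl file has to match the layout of the converted data.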
What did I miss in following your advice?
Thank you for your cooperation.
Hello,
Are you using `legacy` or `cudnn` layout? If it's `cudnn` and you are running on GPU, then in the `DNNImageReLULayer` code you should have `imageLayout="cudnn"`.
Hello Alexey
Thanks a lot. Your advice saved me!!!!
I'm very sorry to trouble you.
Does your advice mean that the developer must change the *.ndl file depending on CPU/GPU mode? Or does CNTK support changing the imageLayout value from outside?
Yes, you can use a variable that you set on the command line, e.g. in NDL say something like
features = ImageInput (imageW, imageH, 1, imageLayout=**$imageLayout$**)
where $imageLayout$ will get replaced by the content of a variable called 'imageLayout' (which, incidentally, is completely independent of the optional parameter of ImageInput() that is also called 'imageLayout'; they live in different namespaces).
Then add this to the command line:
imageLayout="cudnn"
Ideally we should be able to set this automatically, and this should be considered a temporary solution until we have a CPU equivalent of cuDNN's convolution functions ready.
Sorry, Markdown did not show correctly. This should just be
features = ImageInput (imageW, imageH, 1, imageLayout=$imageLayout$)
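(Putting the two pieces together, a sketch of the pattern described above; the file name is assumed from this thread's example:)

```
# In 01_Convolution.ndl: reference a config variable instead of hard-coding
features = ImageInput(imageW, imageH, 1, imageLayout=$imageLayout$)
```

and then pass the value on the command line (or define it in the .cntk file):

```
cntk configFile=01_Convolution.cntk imageLayout="cudnn"
```

Running on CPU, you would pass `imageLayout="legacy"` instead, together with the legacy-converted data.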
Thanks so much!!
I guess the issue owner will be able to resolve the original issue.
Just FYI, the CIFAR-10 samples should be working now both on GPU and CPU. Note that for CPU you still need to provide the following command line arguments (or change them right at the beginning of the corresponding .cntk file): configFile=.... deviceId=-1 imageLayout="legacy"
Hi,
I'm trying to run the standard 01_Convolution.cntk from the CIFAR-10 tutorial.
However, I always end up with the following error:
I have also tried changing "deviceId" to zero, as some suggested, without success.
This is the whole .log file: