deeplearning4j / deeplearning4j


transferLearning API throwing ND4JIllegalStateException #3302

Closed: copypasteearth closed this issue 7 years ago

copypasteearth commented 7 years ago

I'm trying to use the Transfer Learning API like so:

ComputationGraph net1 = ModelSerializer.restoreComputationGraph("C:\\Users\\click\\Documents\\wordVectors\\questionGraph200.txt", true);
ComputationGraph net = new TransferLearning.GraphBuilder(net1)
        //.fineTuneConfiguration(fineTuneConf)
        //.setFeatureExtractor("merge")
        .setFeatureExtractor("cnn3")                 // freeze everything up to and including "cnn3"
        //.setFeatureExtractor("cnn4")
        //.setFeatureExtractor("cnn5")
        .nOutReplace("out", 48, WeightInit.XAVIER)   // replace the output layer, now with 48 outputs
        .build();
TransferLearningHelper transferLearningHelper = new TransferLearningHelper(net);

and it throws an exception on the line TransferLearningHelper transferLearningHelper = new TransferLearningHelper(net);

Here is the exception:

Exception in thread "main" org.nd4j.linalg.exception.ND4JIllegalStateException: Invalid shape: Requested INDArray shape [1, 0] contains dimension size values < 1 (all dimensions must be 1 or more)
    at org.nd4j.linalg.factory.Nd4j.checkShapeValues(Nd4j.java:4776)
    at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4766)
    at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.toFlattened(CpuNDArrayFactory.java:502)
    at org.nd4j.linalg.factory.Nd4j.toFlattened(Nd4j.java:1778)
    at org.deeplearning4j.nn.layers.BaseLayer.params(BaseLayer.java:267)
    at org.deeplearning4j.nn.transferlearning.TransferLearningHelper.copyOrigParamsToSubsetGraph(TransferLearningHelper.java:416)
    at org.deeplearning4j.nn.transferlearning.TransferLearningHelper.initHelperGraph(TransferLearningHelper.java:253)
    at org.deeplearning4j.nn.transferlearning.TransferLearningHelper.<init>(TransferLearningHelper.java:66)
    at org.deeplearning4j.examples.nlp.paragraphvectors.MainTest.main(MainTest.java:55)
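
In case it helps narrow things down, here is a small debugging sketch (assumptions: net is the graph loaded above, and the DL4J 0.8.0 Layer API) that walks the layers and prints each one's parameter count, so the layer whose params() trips the shape check stands out:

    for (org.deeplearning4j.nn.api.Layer layer : net.getLayers()) {
        String name = layer.conf().getLayer().getLayerName();
        try {
            System.out.println(name + ": numParams = " + layer.numParams());
        } catch (Exception e) {
            // the parameterless globalPool layer may fail here too on 0.8.0
            System.out.println(name + " failed: " + e);
        }
    }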

Here is the original config:

int batchSize = 64;
int vectorSize = 100;               //Size of the word vectors. 300 in the Google News model
int nEpochs = 1;                    //Number of epochs (full passes of training data) to train on
int truncateReviewsToLength = 256;  //Truncate reviews with length (# words) greater than this
int cnnLayerFeatureMaps = 100;      //Number of feature maps / channels / depth for each CNN layer

Random rng = new Random(12345);
DataSetIterator iter = QuestionIterator.getDataSetIterator(word2Vec, batchSize, truncateReviewsToLength, rng);
PoolingType globalPoolingType = PoolingType.MAX;

ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
        .weightInit(WeightInit.RELU)
        .activation(Activation.LEAKYRELU)
        .updater(Updater.ADAM)
        .convolutionMode(ConvolutionMode.Same)      //This is important so we can 'stack' the results later
        .regularization(true).l2(0.0001)
        .learningRate(0.01)
        .graphBuilder()
        .addInputs("input")
        .addLayer("cnn3", new ConvolutionLayer.Builder()
            .kernelSize(3, vectorSize)
            .stride(1, vectorSize)
            .nIn(1)
            .nOut(cnnLayerFeatureMaps)
            .build(), "input")
        .addLayer("cnn4", new ConvolutionLayer.Builder()
            .kernelSize(4, vectorSize)
            .stride(1, vectorSize)
            .nIn(1)
            .nOut(cnnLayerFeatureMaps)
            .build(), "input")
        .addLayer("cnn5", new ConvolutionLayer.Builder()
            .kernelSize(5, vectorSize)
            .stride(1, vectorSize)
            .nIn(1)
            .nOut(cnnLayerFeatureMaps)
            .build(), "input")
        .addVertex("merge", new MergeVertex(), "cnn3", "cnn4", "cnn5")      //Perform depth concatenation
        .addLayer("globalPool", new GlobalPoolingLayer.Builder()
            .poolingType(globalPoolingType)
            .build(), "merge")
        .addLayer("out", new OutputLayer.Builder()
            .lossFunction(LossFunctions.LossFunction.MCXENT)
            .activation(Activation.SOFTMAX)
            .nIn(3 * cnnLayerFeatureMaps)
            .nOut(47)    //47 question classes
            .build(), "globalPool")
        .setOutputs("out")
        .build();
eraly commented 7 years ago

Please also add your conf to the issue

copypasteearth commented 7 years ago

I just added it

eraly commented 7 years ago

Thanks!

copypasteearth commented 7 years ago

Here is the actual ComputationGraph (questionGraph200.txt) if you'd like to load it and debug the problem.

copypasteearth commented 7 years ago

I went through my corpus with my word2vec model and removed all of the words that returned a null INDArray, then retrained a ComputationGraph on the new corpus, but the same error happened when I tried to load it into TransferLearningHelper.
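
The filtering was roughly like this (a sketch; dropUnknownWords and the token list are illustrative names, not from my project):

    import java.util.ArrayList;
    import java.util.List;
    import org.deeplearning4j.models.word2vec.Word2Vec;

    // keep only tokens the Word2Vec model has a vector for, so that
    // lookups never hand a null INDArray to the iterator downstream
    static List<String> dropUnknownWords(Word2Vec word2Vec, List<String> tokens) {
        List<String> filtered = new ArrayList<>();
        for (String token : tokens) {
            if (word2Vec.hasWord(token)) {
                filtered.add(token);
            }
        }
        return filtered;
    }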

liangzu commented 7 years ago

I ran into the same problem. Did you find a solution?

copypasteearth commented 7 years ago

No, I haven't found a solution yet.

superwangvip commented 7 years ago

I ran into the same problem too.

AlexDBlack commented 7 years ago

I was able to reproduce this with DL4J 0.8.0, and confirmed it is fixed on DL4J master.

It appears the problem was in BaseLayer.params(), which called Nd4j.toFlattened on the empty/null parameter array of the global pooling layer; this has since been fixed.
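
For anyone who wants to confirm the cause on 0.8.0, the same validation can be triggered in isolation (a minimal sketch; it reproduces the shape check failure only, not the transfer learning path itself):

    import org.nd4j.linalg.factory.Nd4j;

    public class ShapeCheckRepro {
        public static void main(String[] args) {
            // On ND4J 0.8.0, any requested dimension < 1 fails Nd4j.checkShapeValues,
            // the same validation BaseLayer.params() hit when flattening the empty
            // parameter array of the global pooling layer.
            Nd4j.create(1, 0);   // throws ND4JIllegalStateException: Invalid shape
        }
    }

Since the fix is in BaseLayer.params() on master, upgrading past 0.8.0 avoids the crash.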

lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.