deeplearning4j / deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...
http://deeplearning4j.konduit.ai
Apache License 2.0

DataSet.next() raises NullPointerException #2235

Closed (JieZou1 closed this issue 7 years ago)

JieZou1 commented 7 years ago

I want to test a ConvNet model on a set of images in a folder. Here is my attempt:

    String modelFile = "LeNet5.model";
    model = ModelSerializer.restoreMultiLayerNetwork(modelFile);

    FileSplit filesInDir = new FileSplit(new File(testImageFolder), allowedExtensions);
    ImageRecordReader testRecordReader = new ImageRecordReader(height, width, channels);
    testRecordReader.initialize(filesInDir);

    ImagePreProcessingScaler myScaler = new ImagePreProcessingScaler(0, 1);
    DataSetIterator testDataIter = new RecordReaderDataSetIterator(testRecordReader, 64, 1, 2);
    testDataIter.setPreProcessor(myScaler);

    DataSet ds = testDataIter.next();
    INDArray featureMatrix = ds.getFeatureMatrix();
    INDArray result = model.output(featureMatrix, false);

It throws a java.lang.NullPointerException, as listed below:

Exception in thread "main" java.lang.NullPointerException
    at org.nd4j.linalg.dataset.DataSet.merge(DataSet.java:137)
    at org.nd4j.linalg.dataset.DataSet.merge(DataSet.java:348)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:161)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:333)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:46)

Do the labels of the DataSet have to exist? The images are unlabeled, so no labels are known. Could anyone give me some hints? Many thanks.
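The stack trace shows the exception being thrown inside `DataSet.merge`, which is consistent with the merge step touching a labels array that was never populated. The following plain-Java sketch illustrates that pattern only. `MergeSketch`, `Example`, `Batch`, `naiveMerge` and `safeMerge` are hypothetical names invented for this illustration and are not the nd4j implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only: all names here are hypothetical stand-ins,
// not the nd4j DataSet API.
final class MergeSketch {

    static final class Example {
        final double[] features;
        final double[] labels; // null for an unlabeled image

        Example(double[] features, double[] labels) {
            this.features = features;
            this.labels = labels;
        }
    }

    static final class Batch {
        final double[][] features;
        final double[][] labels; // null when the examples carried no labels

        Batch(double[][] features, double[][] labels) {
            this.features = features;
            this.labels = labels;
        }
    }

    // Assumes every example has labels; cloning a null labels array
    // throws NullPointerException.
    static Batch naiveMerge(List<Example> examples) {
        double[][] f = new double[examples.size()][];
        double[][] l = new double[examples.size()][];
        for (int i = 0; i < examples.size(); i++) {
            f[i] = examples.get(i).features.clone();
            l[i] = examples.get(i).labels.clone();
        }
        return new Batch(f, l);
    }

    // Always stacks features; stacks labels only when every example has them.
    static Batch safeMerge(List<Example> examples) {
        boolean labeled = !examples.isEmpty();
        for (Example e : examples) {
            if (e.labels == null) {
                labeled = false;
            }
        }
        double[][] f = new double[examples.size()][];
        double[][] l = labeled ? new double[examples.size()][] : null;
        for (int i = 0; i < examples.size(); i++) {
            f[i] = examples.get(i).features.clone();
            if (labeled) {
                l[i] = examples.get(i).labels.clone();
            }
        }
        return new Batch(f, l);
    }

    public static void main(String[] args) {
        List<Example> unlabeled = new ArrayList<>();
        unlabeled.add(new Example(new double[] {0.1, 0.2}, null));
        unlabeled.add(new Example(new double[] {0.3, 0.4}, null));

        Batch batch = safeMerge(unlabeled);
        System.out.println("rows=" + batch.features.length
                + ", labeled=" + (batch.labels != null));

        try {
            naiveMerge(unlabeled);
        } catch (NullPointerException e) {
            System.out.println("naive merge failed on missing labels");
        }
    }
}
```

For pure inference only the feature rows are needed, so a merge that tolerates missing labels would be enough in principle. Whether the iterator in the installed version behaves that way is something to check against its source.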

BTW, I have confirmed that everything works when I build the INDArray directly, as below:

    File[] testImageFiles = (new File(testImageFolder)).listFiles(new FilenameFilter() {
        public boolean accept(File dir, String name) {return name.toLowerCase().endsWith(".png");}});
    opencv_core.Mat[] images = new opencv_core.Mat[testImageFiles.length];
    for (int i = 0; i < testImageFiles.length; i++)
    {
        String testImageFile = testImageFiles[i].getAbsolutePath();
        images[i] = opencv_imgcodecs.imread(testImageFile);
    }

    log.info("Construct Mats to INDArrays...");
    // NativeImageLoader takes (height, width, channels), in that order
    NativeImageLoader imageLoader = new NativeImageLoader(height, width, channels);
    List<INDArray> slices = new ArrayList<>();
    for (int i = 0; i < images.length; i++)
    {
        INDArray arr = imageLoader.asMatrix(images[i]);
        slices.add(arr);
    }
    // Batch layout is [numImages, channels, height, width]
    INDArray imageSet = new NDArray(slices, new int[] {images.length, channels, height, width});
    imageSet.divi(255.0); // scale pixel values from 0..255 into 0..1

    log.info("Predict...");
    INDArray results = model.output(imageSet);

Also, is this INDArray-based approach efficient, or is there a more efficient way to do it? Thanks.
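Both `ImagePreProcessingScaler(0, 1)` in the first snippet and `imageSet.divi(255.0)` in the second are meant to bring 8-bit pixel values into the range 0 to 1, so the two paths should feed the model comparable inputs. A minimal plain-Java sketch of that arithmetic follows. `PixelScaling` and `scale` are hypothetical names used only for illustration and are not part of DL4J or ND4J.

```java
import java.util.Arrays;

// Plain-Java illustration of linear pixel scaling; names are hypothetical.
final class PixelScaling {

    // Maps 8-bit pixel values (0..255) linearly into [min, max].
    static double[] scale(int[] pixels, double min, double max) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = min + (pixels[i] / 255.0) * (max - min);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] scaled = scale(new int[] {0, 51, 255}, 0.0, 1.0);
        System.out.println(Arrays.toString(scaled));
    }
}
```

With `min = 0` and `max = 1` this reduces to dividing by 255, which is what the manual INDArray version does.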

JieZou1 commented 7 years ago

It seems that adding a labelMaker as follows solves the problem:

    // A labelMaker is needed here; without it, testDataIter.next() throws a NullPointerException
    ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
    ImageRecordReader testRecordReader = new ImageRecordReader(height, width, channels, labelMaker);

But could anyone explain why? I guess it is because the labels are not set. Is it true that a DataSet requires labels to be set?
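For context, `ParentPathLabelGenerator` derives each image's label from the name of the directory that contains the file. The plain-Java sketch below shows that idea. `ParentDirLabel`, `labelFor` and the example paths are hypothetical, chosen only for illustration.

```java
import java.io.File;

// Plain-Java illustration of "label = parent directory name"; names are hypothetical.
final class ParentDirLabel {

    // Returns the name of the directory that directly contains the file,
    // or an empty string when the path has no parent component.
    static String labelFor(File image) {
        File parent = image.getParentFile();
        return parent == null ? "" : parent.getName();
    }

    public static void main(String[] args) {
        System.out.println(labelFor(new File("data/cat/img001.png")));
        System.out.println(labelFor(new File("testImages/img002.png")));
    }
}
```

When all test images sit in one folder, every image gets the same folder-name label under this scheme. That label is only a placeholder and should be ignored when reading the model's predictions.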

samuelwaskow commented 7 years ago

Hi, I'm facing the same issue but with a different configuration.


public class Cifar {

    private static final int WIDTH = 32;
    private static final int HEIGHT = 32;

    private static final int BATCH_SIZE = 100;
    private static final int ITERATIONS = 10;

    private static final int SEED = 123;

    private static final List<String> LABELS = Arrays.asList("airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck");

    private static final Logger log = LoggerFactory.getLogger(Cifar.class);

    public static void main(String[] args) throws Exception {

        int splitTrainNum = (int) (BATCH_SIZE * 0.8);

        DataSet cifarDataSet;
        SplitTestAndTrain trainAndTest;
        DataSet trainInput;
        final List<INDArray> testInput = new ArrayList<>();
        final List<INDArray> testLabels = new ArrayList<>();

        Nd4j.ENFORCE_NUMERICAL_STABILITY = true;

        RecordReader recordReader = loadData();
        DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, BATCH_SIZE, 1024, 10);

        MultiLayerNetwork model = new MultiLayerNetwork(getConfiguration());
        model.setListeners(new ScoreIterationListener(1));
        model.init();

        log.info("Train model");
        while (dataSetIterator.hasNext()) {
            cifarDataSet = dataSetIterator.next();
            trainAndTest = cifarDataSet.splitTestAndTrain(splitTrainNum, new Random(SEED));
            trainInput = trainAndTest.getTrain();
            testInput.add(trainAndTest.getTest().getFeatureMatrix());
            testLabels.add(trainAndTest.getTest().getLabels());
            model.fit(trainInput);
        }

        log.info("Evaluate model");
        Evaluation eval = new Evaluation(LABELS.size());
        for (int i = 0; i < testInput.size(); i++) {
            INDArray output = model.output(testInput.get(i));
            eval.eval(testLabels.get(i), output);
        }

        log.info(eval.stats());
    }

    public static MultiLayerConfiguration getConfiguration() {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(SEED)
                .iterations(ITERATIONS)
                .momentum(0.9)
                .regularization(true)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .list()
                .layer(0, new ConvolutionLayer.Builder(new int[]{5, 5})
                        .nIn(1)
                        .nOut(20)
                        .stride(new int[]{1, 1})
                        .activation("relu")
                        .weightInit(WeightInit.XAVIER)
                        .build())
                .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, new int[]{2, 2})
                        .build())
                .layer(2, new ConvolutionLayer.Builder(new int[]{5, 5})
                        .nIn(20)
                        .nOut(40)
                        .stride(new int[]{1, 1})
                        .activation("relu")
                        .weightInit(WeightInit.XAVIER)
                        .build())
                .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX, new int[]{2, 2})
                        .build())
                .layer(4, new DenseLayer.Builder()
                        .nIn(40 * 5 * 5)
                        .nOut(1000)
                        .activation("relu")
                        .weightInit(WeightInit.XAVIER)
                        .dropOut(0.5)
                        .build())
                .layer(5, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nIn(1000)
                        .nOut(LABELS.size())
                        .dropOut(0.5)
                        .weightInit(WeightInit.XAVIER)
                        .build())
                .inputPreProcessor(0, new FeedForwardToCnnPreProcessor(WIDTH, HEIGHT, 1))
                .inputPreProcessor(4, new CnnToFeedForwardPreProcessor())
                .backprop(true).pretrain(false)
                .build();

        return conf;
    }

    public static RecordReader loadData() throws Exception {

        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
        RecordReader imageReader = new ImageRecordReader(32, 32, 1, labelMaker);
        imageReader.initialize(new FileSplit(new File(System.getProperty("user.home"), "/Downloads/cifar/img/train")));

        RecordReader labelsReader = new CSVRecordReader();
        labelsReader.initialize(new FileSplit(new File(System.getProperty("user.home"), "/Downloads/cifar/labels.csv")));

        return new ComposableRecordReader(imageReader, labelsReader);
    }
}
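As a side check on the configuration above, the `nIn(40 * 5 * 5)` of the dense layer can be reproduced from the layer sizes. The sketch below is plain Java with hypothetical names (`ConvShape`, `outSize`, `flattenedSize`). It assumes no padding and a pooling stride of 2, neither of which is set explicitly in the posted builder calls.

```java
// Plain-Java arithmetic check; names are hypothetical, and zero padding plus
// a pooling stride of 2 are assumptions not stated in the posted code.
final class ConvShape {

    // Output size along one dimension for an unpadded convolution or pooling:
    // floor((in - kernel) / stride) + 1
    static int outSize(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1;
    }

    // Number of inputs to the dense layer after conv 5x5, pool 2x2,
    // conv 5x5, pool 2x2 on a square input, with the given final channel count.
    static int flattenedSize(int input, int channels) {
        int s = outSize(input, 5, 1); // first 5x5 convolution, stride 1
        s = outSize(s, 2, 2);         // first 2x2 max pooling, stride 2 (assumed)
        s = outSize(s, 5, 1);         // second 5x5 convolution, stride 1
        s = outSize(s, 2, 2);         // second 2x2 max pooling, stride 2 (assumed)
        return channels * s * s;
    }

    public static void main(String[] args) {
        System.out.println(flattenedSize(32, 40));
    }
}
```

Under those assumptions the spatial size goes 32, 28, 14, 10, 5, which matches the `40 * 5 * 5` written in the configuration.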

[main] INFO org.nd4j.nativeblas.NativeOps - Number of threads used for NativeOps: 4
Unable to guess runtime. Please set OMP_NUM_THREADS or equivalent manually.
[main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for BLAS: 4
[main] INFO org.reflections.Reflections - Reflections took 243 ms to scan 10 urls, producing 121 keys and 415 values 
[main] INFO org.reflections.Reflections - Reflections took 8583 ms to scan 188 urls, producing 6168 keys and 47911 values 
[main] INFO org.reflections.Reflections - Reflections took 5622 ms to scan 188 urls, producing 6168 keys and 47911 values 
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" regularization is set to true but l1, l2 or dropout has not been added to configuration.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" regularization is set to true but l1, l2 or dropout has not been added to configuration.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" regularization is set to true but l1, l2 or dropout has not been added to configuration.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" regularization is set to true but l1, l2 or dropout has not been added to configuration.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] WARN org.deeplearning4j.nn.conf.NeuralNetConfiguration - Layer "Layer not named" momentum has been set but will not be applied unless the updater is set to NESTEROVS.
[main] INFO org.reflections.Reflections - Reflections took 40 ms to scan 10 urls, producing 121 keys and 415 values 
[main] INFO com.bloase.centeraccounting.ai.nn.Cifar - Train model
Exception in thread "main" java.lang.NullPointerException
    at org.nd4j.linalg.dataset.DataSet.merge(DataSet.java:137)
    at org.nd4j.linalg.dataset.DataSet.merge(DataSet.java:348)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:161)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:333)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:46)
    at xxx.ai.nn.Cifar.main(Cifar.java:73)
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 21.819s
Finished at: Mon Nov 07 18:35:15 BRST 2016
Final Memory: 12M/224M
------------------------------------------------------------------------