deeplearning4j / deeplearning4j-examples

Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)
http://deeplearning4j.konduit.ai

Exceptions in MultiClassLogit example with fix proposals and warning in IteratorDataSetIterator. #938

Open rperkow opened 4 years ago

rperkow commented 4 years ago

[ INFO ]
Deeplearning4j version: 1.0.0-beta5
Platform information: Windows 10
CUDA version: No
NVIDIA driver version: No

Issue #1: File: .../src/main/java/org/deeplearning4j/examples/dataexamples/MultiClassLogit.java

[ EXCEPTION ]
Exception in thread "main" java.lang.IllegalStateException: Indices are out of range: Cannot get interval index Interval(b=0,e=2,s=1) on array with size(0)=1. Array shape: [1], indices: [Interval(b=0,e=2,s=1)]
    at org.nd4j.linalg.api.ndarray.BaseNDArray.get(BaseNDArray.java:4253)
    at org.nd4j.linalg.dataset.DataSet.getRange(DataSet.java:234)
    at org.deeplearning4j.examples.mytests.IteratorDataSetIterator.next(IteratorDataSetIterator.java:92)
    at org.deeplearning4j.examples.mytests.IteratorDataSetIterator.next(IteratorDataSetIterator.java:68)
    at org.deeplearning4j.examples.mytests.IteratorDataSetIterator.next(IteratorDataSetIterator.java:1)
    at org.deeplearning4j.examples.mytests.MultiClassLogit.getIrisDataSet(MultiClassLogit.java:89)
    at org.deeplearning4j.examples.mytests.MultiClassLogit.main(MultiClassLogit.java:60)

[ CODE ]
irisDataSet = iter.next();
    at org.deeplearning4j.examples.mytests.MultiClassLogit.getIrisDataSet(MultiClassLogit.java:89)

int nExamples = next.numExamples(); // => 4 instead of 1
    at org.deeplearning4j.examples.mytests.IteratorDataSetIterator.next(IteratorDataSetIterator.java:85)

inputColumns = (int) temp.getFeatures().size(1); // => rank 1 instead of 2
    at org.deeplearning4j.examples.mytests.IteratorDataSetIterator.next(IteratorDataSetIterator.java:105)

[ FIX ] The features array in a DataSet should have rank at least 2. The shape, including the additional leading dimension, should be specified explicitly for the input data (here, a one-dimensional double array).

return new DataSet(
    Nd4j.create(Arrays.copyOfRange(parsedRows, 0, columns - 1), new long[] { 1, columns - 1 }),
    Nd4j.create(Arrays.copyOfRange(parsedRows, columns - 1, columns), new long[] { 1, 1 }));

but it reveals the next issue...

Issue #2: File: .../src/main/java/org/deeplearning4j/examples/dataexamples/MultiClassLogit.java

[ EXCEPTION ]
Error at [D:/jenkins/ws/dl4j-deeplearning4j-1.0.0-beta5-windows-x86_64-cpu/libnd4j/include/ops/declarable/generic/transforms/concat.cpp:78:0]:
CONCAT op: all of input arrays must have same type !
Exception in thread "main" java.lang.RuntimeException: Op [concat] execution failed
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1710)
    at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6606)
    at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.concat(CpuNDArrayFactory.java:557)
    at org.nd4j.linalg.factory.Nd4j.concat(Nd4j.java:4917)
    at org.nd4j.linalg.factory.BaseNDArrayFactory.hstack(BaseNDArrayFactory.java:963)
    at org.nd4j.linalg.factory.Nd4j.hstack(Nd4j.java:4673)
    at org.deeplearning4j.examples.mytests.MultiClassLogit.prependConstant(MultiClassLogit.java:168)
    at org.deeplearning4j.examples.mytests.MultiClassLogit.trainModel(MultiClassLogit.java:125)
    at org.deeplearning4j.examples.mytests.MultiClassLogit.main(MultiClassLogit.java:73)
Caused by: java.lang.RuntimeException: Op validation failed
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:2006)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1700)
    ... 8 more

[ CODE ]
return Nd4j.hstack(Nd4j.ones(dataset.getFeatures().size(0), 1), dataset.getFeatures());
    at org.deeplearning4j.examples.mytests.MultiClassLogit.prependConstant(MultiClassLogit.java:168)

return getExecutioner().exec(op);
    at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6606)

val result = exec(op, context);
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1700)

throw new RuntimeException("Op [" + name + "] execution failed", e);
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1710)

[ FIX ] The default data type for Nd4j arrays is FLOAT (e.g. for Nd4j.rand, Nd4j.zeros, Nd4j.ones, etc.). In the MultiClassLogit example, the String stream is converted to a double array by a lambda expression (mapRowToDataSet), so the arrays built from it are DOUBLE. It seems that this mismatch between DOUBLE and FLOAT Nd4j arrays causes some operations to fail at the native-code level. Unfortunately, the standard Java library offers primitive stream specializations for only 3 of the 8 primitive data types (yes: int/long/double, no: short/byte/char/boolean/float). I have no idea why. A "mapToFloat" would be quite useful in this case, to get a float array directly from the String stream. There is not always a need to spend an additional 4 bytes on a single number; the float type is very often more than enough.

return new DataSet(
    Nd4j.create(Arrays.copyOfRange(parsedRows, 0, columns - 1), new long[] { 1, columns - 1 }, DataType.FLOAT),
    Nd4j.create(Arrays.copyOfRange(parsedRows, columns - 1, columns), new long[] { 1, 1 }, DataType.FLOAT));
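Since there is no "mapToFloat", one way to get a float array straight from a text row is a plain loop over the split tokens. The following is only a minimal sketch using the standard library; the class name FloatRowParser, the method parseRow, and the sample row are hypothetical and not part of the example project.

```java
import java.util.Arrays;

// Hypothetical helper, not part of deeplearning4j-examples.
class FloatRowParser {

    // Parses one comma-separated row into a float[] without going through double[].
    static float[] parseRow(String line) {
        String[] tokens = line.split(",");
        float[] values = new float[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Float.parseFloat(tokens[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample row: four feature columns followed by one label column.
        float[] row = parseRow("5.1,3.5,1.4,0.2,0");
        int columns = row.length;
        float[] features = Arrays.copyOfRange(row, 0, columns - 1);
        float[] label = Arrays.copyOfRange(row, columns - 1, columns);
        System.out.println(Arrays.toString(features) + " -> " + Arrays.toString(label));
    }
}
```

The resulting float arrays could then be handed to an Nd4j factory method together with an explicit shape, assuming an overload that accepts float[] is available in the version in use.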

and finally, no further RuntimeException is thrown, but...

Issue #3: File: .../src/main/java/org/deeplearning4j/datasets/iterator/IteratorDataSetIterator.java

[ WARNING ]
Breakpoint lines:
DataSetIterator iter = new IteratorDataSetIterator(data.iterator(), 150);
irisDataSet = iter.next();

[ CODE ]
constructor...
public IteratorDataSetIterator(Iterator<DataSet> iterator, int batchSize) {
    this.iterator = iterator;
    this.batchSize = batchSize;
    this.queued = new LinkedList<>();
}

next()...
public DataSet next() {
    return next(batchSize);
}

next(int)...
public DataSet next(int num) {
    if (!hasNext())
        throw new NoSuchElementException();
    List<DataSet> list = new ArrayList<>();
    int countSoFar = 0;
    while ((!queued.isEmpty() || iterator.hasNext()) && countSoFar < batchSize) {
        DataSet next;
        if (!queued.isEmpty()) {
            next = queued.removeFirst();
        } else {
            next = iterator.next();
        }
        int nExamples = next.numExamples();
        if (countSoFar + nExamples <= batchSize) {
            // Add the entire DataSet as-is
            list.add(next);
        } else {
            // Otherwise, split it
            DataSet toKeep = (DataSet) next.getRange(0, batchSize - countSoFar);
            DataSet toCache = (DataSet) next.getRange(batchSize - countSoFar, nExamples);
            list.add(toKeep);
            queued.add(toCache);
        }
        countSoFar += nExamples;
    }
}

[ REMARKS ] When next() is called, it invokes the overloaded method next(int) and passes the value of the "batchSize" field as the "num" parameter, but "num" is never used after that.
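The behavior described in this remark can be reproduced with a simplified stand-in. This is only a sketch under stated assumptions: the class FixedBatchIterator is hypothetical, it batches plain Integer items instead of DataSet objects, and it leaves out the splitting and queueing logic of the real iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical, simplified model of a batching iterator; not the DL4J class.
class FixedBatchIterator {
    private final Iterator<Integer> source;
    private final int batchSize;

    FixedBatchIterator(Iterator<Integer> source, int batchSize) {
        this.source = source;
        this.batchSize = batchSize;
    }

    boolean hasNext() {
        return source.hasNext();
    }

    List<Integer> next() {
        return next(batchSize);
    }

    // Mirrors the reported behavior: 'num' is accepted, but the loop is bounded by batchSize.
    List<Integer> next(int num) {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<Integer> batch = new ArrayList<>();
        while (source.hasNext() && batch.size() < batchSize) {
            batch.add(source.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        FixedBatchIterator it =
                new FixedBatchIterator(Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator(), 3);
        System.out.println(it.next(1)); // prints [1, 2, 3]: three items although 1 was requested
        System.out.println(it.next());  // prints [4, 5, 6]
    }
}
```

In this model a caller asking for next(1) still receives a full batch, which is the surprise the remark points at.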

eraly commented 4 years ago

Hi @rafal-perkowski, thank you for taking such a close look at this.

We are discussing a fix for the rank=1 case. You can follow it here

The other two issues are fixed in a repo under my account as part of an example review/refactor. But feel free to open a PR, and the fix will make it into the repo quicker!

On num not being used: that method signature is part of the interface that has to be implemented. Note that the batch size is a final class variable. The docs should probably explain that next(num) defaults to next(batchSize)...

eraly commented 4 years ago

We can close this. Issues are fixed.

EliseuBreak commented 3 months ago

CONCAT op: all of input arrays must have same type ! Exception in thread "main" java.lang.RuntimeException: Op [concat] execution failed

import org.nd4j.linalg.factory.Nd4j;

I get this error when I pass values to the hstack method. I need help: is there an alternative way to resolve this error?

for (int j = 0; j < stateBatch.size(0); j++) {
    INDArray state = stateBatch.getRow(j);
    INDArray actionAux = actionBatch.getRow(j);
    stateActionArray[j] = Nd4j.hstack(state, actionAux);

    System.out.println("Treinamento J: " + j);
}