deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

java.nio.ReadOnlyBufferException when calling TensorFlow model. #570

Closed macster110 closed 3 years ago

macster110 commented 3 years ago

Question

I am trying to call a TensorFlow model and keep getting a ReadOnlyBufferException. Originally this was an .hdf5 model, which I converted to a .pb file for use with DJL as per the instructions here.

The model input (in Python) is a float64 numpy array of shape (N, 40, 40, 1). The model loads fine using DJL, and I've created a translator that takes a double[][] array of the same shape, but when calling NDArray array = manager.create(specgramFlat, shape); I get a ReadOnlyBufferException.

A minimal reproducible example is below, and you can download the zipped model from https://1drv.ms/u/s!AkNvdu-1_rHOgahqrZwrhu6V8v3TFA?e=0BR4a3.

Am I going about loading this model the right way? Any help on this would be much appreciated. Thanks!

import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.Random;

import ai.djl.Model;
import ai.djl.engine.Engine;
import ai.djl.inference.Predictor;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;
import ai.djl.translate.Batchifier;
import ai.djl.translate.Translator;
import ai.djl.translate.TranslatorContext;

/**
 * A minimal reproducible example of a java.nio.ReadOnlyBufferException when trying to call a TensorFlow classifier.
 * 
 * @author Jamie Macaulay 
 */
public class ReadBufferExceptionTest {

    public static void main(String[] args) {

        String modelPath = "saved_model.pb";

        try {

            //load the Tensorflow model. 
            File file = new File(modelPath); 

            Path modelDir = Paths.get(file.getAbsoluteFile().getParent()); // the directory containing the model file (absolute, even if the supplied path is relative)

            System.out.println(Engine.getAllEngines()); 

            Model model = Model.newInstance(modelPath, "TensorFlow"); 

            model.load(modelDir, "saved_model.pb");

            System.out.println("Input: " + model.describeInput().values()); 
            System.out.println("Output: " + model.describeOutput().values()); 

            //create the predictor
            Translator<double[][], float[]>  translator = new Translator<double[][], float[]>() {   

                @Override
                public NDList processInput(TranslatorContext ctx, double[][] data) {
                    //System.out.println("Hello: 1 " ); 
                    NDManager manager = ctx.getNDManager();

                    Shape shape = new Shape(1L, data.length, data[0].length, 1L); 

                    System.out.println("NDArray shape: " + shape); 

                    double[] specgramFlat = flattenDoubleArray(data); 

                    NDArray array = manager.create(specgramFlat, shape); 
                    //      NDArray array = manager.create(data); 

                    System.out.println("NDArray size: " + array.size()); 

                    return new NDList (array);
                }

                @Override
                public float[]  processOutput(TranslatorContext ctx, NDList list) {
                    System.out.println("Hello: 2 " + list); 

                    NDArray temp_arr = list.get(0);

                    Number[] number = temp_arr.toArray(); 

                    float[] results = new float[number.length]; 
                    for (int i=0; i<number.length; i++) {
                        results[i] = number[i].floatValue(); 
                    }

                    return results; 
                }

                @Override
                public Batchifier getBatchifier() {
                    // The Batchifier describes how to combine a batch together
                    // Stacking, the most common batchifier, takes N [X1, X2, ...] arrays to a single [N, X1, X2, ...] array
                    return Batchifier.STACK;
                }
            };
            Predictor<double[][], float[]> predictor = model.newPredictor(translator);

            //make some fake data for input
            double[][] data = makeDummySpectrogramd(40, 40); 

            Shape shape = new Shape(1L, 40, 40, 1L); 

            System.out.println("NDArray shape: " + shape); 

            //          NDArray array = manager.create(specgramFlat, shape); 
            model.getNDManager().create(data); 

            float[] output = predictor.predict(data); 

        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Make a dummy spectrogram for testing. Filled with random values.  
     * @param len - the length of the spectrogram in bins. 
     * @param len2 - the height of the spectrogram in bins. 
     * @return a dummy spectrogram with random values. 
     */
    public static double[][] makeDummySpectrogramd(int len, int len2){

        //      int len = 256; 
        //      int len2 = 128; 

        double[][] specDummy = new double[len][len2]; 

        Random rand = new Random(); 
        for (int i=0; i<len; i++){
            for (int j=0; j<len2; j++) {
                specDummy[i][j] = 2F*(rand.nextFloat()-0.5F);

                if (specDummy[i][j]>1) {
                    specDummy[i][j]=1F;
                }
                if (specDummy[i][j]<0) {
                    specDummy[i][j]=0F;
                }
            }
        }
        return specDummy; 
    }

    /** 
     * Convert an arbitrary-dimensional rectangular double array to flat vector.<br>
     * Can pass double[], double[][], double[][][], etc.
     */
    public static double[] flattenDoubleArray(Object doubleArray) {
        if (doubleArray instanceof double[])
            return (double[]) doubleArray;

        LinkedList<Object> stack = new LinkedList<>();
        stack.push(doubleArray);

        int[] shape = arrayShape(doubleArray);
        int length = prod(shape);
        double[] flat = new double[length];
        int count = 0;

        while (!stack.isEmpty()) {
            Object current = stack.pop();
            if (current instanceof double[]) {
                double[] arr = (double[]) current;
                for (int i = 0; i < arr.length; i++)
                    flat[count++] = arr[i];
            } else if (current instanceof Object[]) {
                Object[] o = (Object[]) current;
                for (int i = o.length - 1; i >= 0; i--)
                    stack.push(o[i]);
            } else
                throw new IllegalArgumentException("Base array is not double[]");
        }

        if (count != flat.length)
            throw new IllegalArgumentException("Fewer elements than expected. Array is ragged?");
        return flat;
    }

    /** Calculate the shape of an arbitrary multi-dimensional array. Assumes:<br>
     * (a) array is rectangular (not ragged) and first elements (i.e., array[0][0][0]...) are non-null <br>
     * (b) First elements have > 0 length. So array[0].length > 0, array[0][0].length > 0, etc.<br>
     * Can pass any Java array type: double[], Object[][][], float[][], etc.<br>
     * Length of returned array is number of dimensions; returned[i] is size of ith dimension.
     */
    public static int[] arrayShape(Object array) {
        int nDimensions = 0;
        Class<?> c = array.getClass().getComponentType();
        while (c != null) {
            nDimensions++;
            c = c.getComponentType();
        }

        int[] shape = new int[nDimensions];
        Object current = array;
        for (int i = 0; i < shape.length - 1; i++) {
            shape[i] = ((Object[]) current).length;
            current = ((Object[]) current)[0];
        }

        if (current instanceof Object[]) {
            shape[shape.length - 1] = ((Object[]) current).length;
        } else if (current instanceof double[]) {
            shape[shape.length - 1] = ((double[]) current).length;
        } else if (current instanceof float[]) {
            shape[shape.length - 1] = ((float[]) current).length;
        } else if (current instanceof long[]) {
            shape[shape.length - 1] = ((long[]) current).length;
        } else if (current instanceof int[]) {
            shape[shape.length - 1] = ((int[]) current).length;
        } else if (current instanceof byte[]) {
            shape[shape.length - 1] = ((byte[]) current).length;
        } else if (current instanceof char[]) {
            shape[shape.length - 1] = ((char[]) current).length;
        } else if (current instanceof boolean[]) {
            shape[shape.length - 1] = ((boolean[]) current).length;
        } else if (current instanceof short[]) {
            shape[shape.length - 1] = ((short[]) current).length;
        } else
            throw new IllegalStateException("Unknown array type"); // Should never happen
        return shape;
    }

    /**
     * Product of an int array
     * @param mult the elements to calculate the product of
     * @return the product of this array
     */
    public static int prod(int... mult) {
        if (mult.length < 1)
            return 0;
        int ret = 1;
        for (int i = 0; i < mult.length; i++)
            ret *= mult[i];
        return ret;
    }

}
roywei commented 3 years ago

Hi @macster110, I was not able to reproduce the ReadOnlyBufferException you mentioned.

If I run your code directly with the model provided, I get 2 errors. I was able to fix them and run successfully.

1. Data type error

org.tensorflow.exceptions.TFInvalidArgumentException: Expects arg[0] to be float but double is provided

This is because the model expects a float32 data type as input, but you are passing double (created in processInput in the translator). You can fix it by generating specgramFlat as a float[] or by converting the NDArray to float32, like this:

NDArray array = manager.create(specgramFlat, shape).toType(DataType.FLOAT32, false);

2. Shape Error

2021-02-01 18:15:59.525914: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at conv_ops_fused_impl.h:716 : Invalid argument: input must be 4-dimensional[1,1,40,40,1]
org.tensorflow.exceptions.TFInvalidArgumentException: input must be 4-dimensional[1,1,40,40,1]

The model accepts any batch size, since the required input shape is [(-1, 40, 40, 1)]. Because you already prepare the input in batched shape (1, 40, 40, 1) (batch size 1), there is no need for a batchifier in the translator; you can return null in getBatchifier().
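
For reference, a minimal sketch of the two changed translator methods, based on your original code (only these two methods differ; it additionally needs import ai.djl.ndarray.types.DataType;):

@Override
public NDList processInput(TranslatorContext ctx, double[][] data) {
    NDManager manager = ctx.getNDManager();
    Shape shape = new Shape(1L, data.length, data[0].length, 1L);
    double[] specgramFlat = flattenDoubleArray(data);
    // Fix 1: the model expects float32 input, so convert the double NDArray
    NDArray array = manager.create(specgramFlat, shape).toType(DataType.FLOAT32, false);
    return new NDList(array);
}

@Override
public Batchifier getBatchifier() {
    // Fix 2: the input already carries its batch dimension (1, 40, 40, 1),
    // so no stacking batchifier is needed
    return null;
}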

After fixing these 2 errors, I was able to run it.

Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
[TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32
macster110 commented 3 years ago

Hi @roywei. Thanks very much for taking the time to check this out. Unfortunately, after fixing those errors I am still getting the same issue. I am using macOS but tried on Windows and got the same error (see the full message below).

I am using Maven to run the project - the project and POM file are here - but as far as I can tell I'm using the latest version of the DJL TensorFlow libraries.

I'm totally stumped... any more ideas about what could be going on?

Thanks again for the help.

[PyTorch, TensorFlow]
2021-02-02 10:26:01.125793: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all/1
2021-02-02 10:26:01.129170: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-02-02 10:26:01.129187: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:250] Reading SavedModel debug info (if present) from: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all/1
2021-02-02 10:26:01.145984: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:215] Restoring SavedModel bundle.
2021-02-02 10:26:01.245650: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:199] Running initialization op on SavedModel bundle at path: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all/1
2021-02-02 10:26:01.261052: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:319] SavedModel load for tags { serve }; Status: success: OK. Took 135259 microseconds.
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
java.nio.ReadOnlyBufferException
    at org.tensorflow.ndarray.impl.buffer.Validator.copyToArgs(Validator.java:67)
    at org.tensorflow.ndarray.impl.buffer.nio.ByteNioDataBuffer.copyTo(ByteNioDataBuffer.java:65)
    at org.tensorflow.Tensor.of(Tensor.java:186)
    at ai.djl.tensorflow.engine.TfNDManager.create(TfNDManager.java:218)
    at ai.djl.tensorflow.engine.TfNDManager.create(TfNDManager.java:46)
    at api@0.9.0/ai.djl.ndarray.NDManager.create(NDManager.java:409)
    at api@0.9.0/ai.djl.ndarray.NDManager.create(NDManager.java:348)
    at jdl4pam/org.jamdev.jdl4pam.genericmodel.ReadBufferExceptionTest.main(ReadBufferExceptionTest.java:107)

lanking520 commented 3 years ago

Hi @macster110 which Java version did you use?

macster110 commented 3 years ago

Good point. I am using 14.0.2 from AdoptOpenJDK. This could possibly be a Java 11+ problem...

macster110 commented 3 years ago

Hi @lanking520 and @roywei. This is indeed a Java issue: Java 8 works, and Java 14 and 15 do not. The project I'm working on really requires Java 11+, so it would be great if there were a workaround. Any ideas?

lanking520 commented 3 years ago

> Hi @lanking520 and @roywei. This is indeed a Java issue: Java 8 works, and Java 14 and 15 do not. The project I'm working on really requires Java 11+, so it would be great if there were a workaround. Any ideas?

Java 11 works. We haven't covered Java 14 much in our tests; we will take a look today.

macster110 commented 3 years ago

> Hi @lanking520 and @roywei. This is indeed a Java issue: Java 8 works, and Java 14 and 15 do not. The project I'm working on really requires Java 11+, so it would be great if there were a workaround. Any ideas?
>
> Java 11 works. We haven't covered Java 14 much in our tests; we will take a look today.

Quick update: I tried Java 11 and it still has the same error on my machine. It was only Java 8 that worked for me...

roywei commented 3 years ago

Hi, the following are my test results.

I tried Java 8, 11, 12, and 15 (11 and 15 were Amazon Corretto builds, but I think that should not matter). All versions work.

Could you try the following?

  1. Remove the TensorFlow binaries under $HOME:

rm -rf $HOME/.djl.ai/tensorflow/

  2. Do a clean build.

  3. Let me know how you installed Java 11 and 14 etc., so I can try the exact versions.

  4. I see you have both the PyTorch and TensorFlow engines as dependencies. Did you try to do anything with PyTorch before running the model? If it just prints all engine names, it should be fine. For reference, my output with multiple engines on the classpath:

[PyTorch, MXNet, TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32

result is: [1.0, 1.9944851E-11]
  5. Also try to print the current engine name before prediction and see what the result is:
            System.out.println(Engine.getInstance().getEngineName());

System

MacOS 10.15.7

Code:

https://gist.github.com/roywei/69763254d5b73a524b1eec6cf28e7dc1

Java 8

java -version                                                                                                              

java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)

./gradlew run -Dmain=ai.djl.examples.inference.ReadBufferExceptionTest -Dai.djl.default_engine=TensorFlow                          

> Task :examples:run
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
[TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32

result is: [0.9999999, 1.6145458E-7]

Java 11

 java -version                                                                                                                     

openjdk version "11.0.8" 2020-07-14 LTS
OpenJDK Runtime Environment Corretto-11.0.8.10.1 (build 11.0.8+10-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.8.10.1 (build 11.0.8+10-LTS, mixed mode)

./gradlew run -Dmain=ai.djl.examples.inference.ReadBufferExceptionTest -Dai.djl.default_engine=TensorFlow                    

Starting a Gradle Daemon, 2 incompatible Daemons could not be reused, use --status for details
> Task :examples:run
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
[TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32

result is: [1.0, 6.302518E-9]

Java 12

java -version                                                                                                                     

java version "12.0.2" 2019-07-16
Java(TM) SE Runtime Environment (build 12.0.2+10)
Java HotSpot(TM) 64-Bit Server VM (build 12.0.2+10, mixed mode, sharing)

./gradlew run -Dmain=ai.djl.examples.inference.ReadBufferExceptionTest -Dai.djl.default_engine=TensorFlow                         

Starting a Gradle Daemon, 1 incompatible Daemon could not be reused, use --status for details

> Task :examples:run
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
[TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/Users/lawei/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32

result is: [1.0, 7.4761025E-14]

Java 15

java -version                                                                                                                     

openjdk version "15" 2020-09-15
OpenJDK Runtime Environment Corretto-15.0.0.36.1 (build 15+36)
OpenJDK 64-Bit Server VM Corretto-15.0.0.36.1 (build 15+36, mixed mode, sharing)

 ./gradlew run -Dmain=ai.djl.examples.inference.ReadBufferExceptionTest -Dai.djl.default_engine=TensorFlow                    

Starting a Gradle Daemon, 3 incompatible Daemons could not be reused, use --status for details

> Task :examples:run
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
[TensorFlow]
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
NDArray shape: (1, 40, 40, 1)
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: /Users/lawei/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Hello: 2 NDList size: 1
0 dense_2: (1, 2) float32

result is: [1.0, 1.896747E-8]
lanking520 commented 3 years ago

@macster110 Do you think it might be an OS issue? Which OS are you running?

macster110 commented 3 years ago

Thanks @roywei and @lanking520 for bearing with on this issue.

So the issue seems to be the Java version: in the versions that do not work (above 8) the engine remains PyTorch. That seems weird to me because the model loads just fine, but Engine.getInstance().getEngineName() returns PyTorch instead of TensorFlow?

I am using macOS 11.1 (20C69) and Java 14.0.2 from AdoptOpenJDK, running the code in Eclipse 2020-12 (4.18.0). Updated code is here.

I'm guessing that this will be solved if we figure out why the engine remains PyTorch, but I am unsure why that might be.

Thanks again for your help with this. It's much appreciated.

2021-02-05 09:41:06.673130: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[PyTorch, TensorFlow]
Engine name before model load: PyTorch
2021-02-05 09:41:06.714469: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all
2021-02-05 09:41:06.717497: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-02-05 09:41:06.717516: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:250] Reading SavedModel debug info (if present) from: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all
2021-02-05 09:41:06.733993: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:215] Restoring SavedModel bundle.
2021-02-05 09:41:06.833548: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:199] Running initialization op on SavedModel bundle at path: /Users/au671271/Desktop/model_lenet_dropout_input_conv_all
2021-02-05 09:41:06.849418: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:319] SavedModel load for tags { serve }; Status: success: OK. Took 134949 microseconds.
Engine name after model load: PyTorch
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
Engine name before prediciton: PyTorch
NDArray shape: (1, 40, 40, 1)
java.nio.ReadOnlyBufferException
    at org.tensorflow.ndarray.impl.buffer.Validator.copyToArgs(Validator.java:67)
    at org.tensorflow.ndarray.impl.buffer.nio.ByteNioDataBuffer.copyTo(ByteNioDataBuffer.java:65)
    at org.tensorflow.Tensor.of(Tensor.java:186)
    at ai.djl.tensorflow.engine.TfNDManager.create(TfNDManager.java:218)
    at ai.djl.tensorflow.engine.TfNDManager.create(TfNDManager.java:46)
    at api@0.9.0/ai.djl.ndarray.NDManager.create(NDManager.java:409)
    at api@0.9.0/ai.djl.ndarray.NDManager.create(NDManager.java:348)
    at jdl4pam/org.jamdev.jdl4pam.genericmodel.ReadBufferExceptionTest.main(ReadBufferExceptionTest.java:114)
frankfliu commented 3 years ago

@macster110 If you have multiple engines in the classpath, the default engine is undefined; it depends purely on the order in which the classes get loaded. I just created a PR to address this; with the latest code, the default engine will be deterministic. But this seems unrelated to the issue you are facing.

It looks like the issue here is:

  1. Multiple engines are loaded in the classpath.
  2. The default engine was set to PyTorch; this can be set explicitly with "-Dai.djl.default_engine=PyTorch".
  3. The code loads the TensorFlow model explicitly with "TensorFlow".

If we can reproduce this, then there is a bug somewhere when multiple engines are loaded.

You can try "-Dai.djl.default_engine=TensorFlow" and see if your issue goes away.
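
If it is awkward to pass JVM arguments from your IDE, here is a minimal sketch of setting the same property programmatically (an assumption: it must run before the first call into DJL so the engine registry picks it up; the class name is just for illustration):

import ai.djl.engine.Engine;

public class DefaultEngineCheck {
    public static void main(String[] args) {
        // Same effect as passing -Dai.djl.default_engine=TensorFlow on the command line
        System.setProperty("ai.djl.default_engine", "TensorFlow");
        System.out.println(Engine.getAllEngines());               // engines found on the classpath
        System.out.println(Engine.getInstance().getEngineName()); // should now print TensorFlow
    }
}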

roywei commented 3 years ago

This is a really strange error. I was able to load the PyTorch and TF engines together and run your latest code successfully. Actually, all 3 engines loaded together also work. Since the NDArray creation happens in the translator's processInput, it uses the context's NDManager, which is tied to the TF engine during model loading. So we are NOT feeding any NDArray created by PyTorch to the TF model.

As a next step we will try upgrading macOS to 11.1 and see if we can reproduce it. Are you using a Mac with an M1 chip?

> Task :examples:ReadBufferExceptionTest.main()
[INFO ] - Number of inter-op threads is 6
[INFO ] - Number of intra-op threads is 6
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
2021-02-05 11:55:41.403472: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[WARN ] - More than one deep learning engines found.
[PyTorch, OnnxRuntime, TensorFlow]
Engine name before model load: PyTorch
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
2021-02-05 11:55:41.469745: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /Users/lawei/Downloads/tensorflow_model
2021-02-05 11:55:41.473848: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-02-05 11:55:41.473871: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:250] Reading SavedModel debug info (if present) from: /Users/lawei/Downloads/tensorflow_model
2021-02-05 11:55:41.491888: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:215] Restoring SavedModel bundle.
2021-02-05 11:55:41.605693: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:199] Running initialization op on SavedModel bundle at path: /Users/lawei/Downloads/tensorflow_model
2021-02-05 11:55:41.623202: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:319] SavedModel load for tags { serve }; Status: success: OK. Took 153456 microseconds.
Engine name after model load: PyTorch
Input: [(-1, 40, 40, 1)]
Output: [(-1, 2)]
Engine name before prediciton: PyTorch
NDArray shape: (1, 40, 40, 1)
NDArray shape: (1, 40, 40, 1)
NDArray size: 1600
Warning: Could not load IntPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Predict NDList size: 1
0 dense_2: (1, 2) float32
macster110 commented 3 years ago

Thanks @roywei. I am using an Intel Mac, but I also tried this on Windows a little earlier (not with the latest code) and got the same error. That makes me think it could be something to do with the way Maven is setting up its dependencies. I'll try running the code in a new project with a minimal POM just to check.

I also see in your output that PyTorch is printed as the engine, so if the code works for you then that isn't necessarily the issue?

roywei commented 3 years ago

That could also be the reason. I was using DJL's examples folder to test your code, which uses Gradle.

Regarding the engines, DJL is designed to allow multiple engines to work together. If you have multiple engines as dependencies, a default engine is chosen automatically. In 0.9.0 the choice is not deterministic; in the latest master code it is deterministic after https://github.com/awslabs/djl/pull/603. You can override the automatic engine selection by adding -Dai.djl.default_engine=XXX. Even with a default engine set, you can still load models with other engines by providing the engine name during loading: Model model = Model.newInstance(modelPath, "TensorFlow");. In that case, the specific model will use the TF engine, while normal NDArray creation and operations outside the model's scope will use the PyTorch engine (the default engine). The translator context is aligned with the model's engine, not the default engine.

It will print TensorFlow if you add the argument -Dai.djl.default_engine=TensorFlow.
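
As a rough illustration of that scoping (a sketch only; the model name is a placeholder and no model files are loaded here):

import ai.djl.Model;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;

public class EngineScopeSketch {
    public static void main(String[] args) {
        // Created from a standalone manager: uses the default engine
        // (PyTorch, if that is what was auto-selected).
        NDManager defaultManager = NDManager.newBaseManager();
        NDArray fromDefault = defaultManager.create(new float[] {1f, 2f});

        // Created from an explicitly TensorFlow model: the model's NDManager,
        // and the TranslatorContext manager used during predict(), belong to
        // the TensorFlow engine regardless of the default engine.
        Model tfModel = Model.newInstance("myModel", "TensorFlow");
        NDArray fromTfModel = tfModel.getNDManager().create(new float[] {1f, 2f});

        System.out.println(fromDefault);
        System.out.println(fromTfModel);
    }
}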

macster110 commented 3 years ago

Thanks @roywei. I just made a new project with exactly the same code and a minimal POM, and it worked! So there must be a dependency in the larger project that is stopping things from working. I'm going to do some trial-and-error work to figure out which one. I will post here once I know more.

macster110 commented 3 years ago

Hi @roywei and all. I have found the issue: if you include a module-info file, then this produces the ReadOnlyBufferException; if not, the error disappears. I have no idea why, but as far as I can tell "api" is the module name generated from the DJL library's jar, and that is a seriously unstable name to use in a module-info file.

Thanks for all your help - I would not have thought to try this without everyone's suggestions and tests.

module jdl4pam {
    exports org.jamdev.jdl4pam.dlpam;
    exports org.jamdev.jdl4pam;
    exports org.jamdev.jdl4pam.pytorch2Java;
    exports org.jamdev.jdl4pam.utils;
    exports org.jamdev.jdl4pam.transforms.jsonfile;
    exports org.jamdev.jdl4pam.SoundSpot;
    exports org.jamdev.jdl4pam.transforms;
    exports org.jamdev.jdl4pam.genericmodel;

    requires api;
    requires java.desktop;
    requires jpamutils;
    requires org.json;
    requires us.hebi.matlab.mat.mfl.core;
}
frankfliu commented 3 years ago

I guess javacpp is using some reflection that is blocked by the module system. We can try to dig further.

DJL is currently compiled targeting Java 8; it has not been modularized for Java 9. At runtime it is treated as an automatic module by your JVM, which uses the jar file's name as the module name. That's why you see api as the module.

We have not decided to move to Java 9+ yet. If you want a stable module name for DJL, you need to rename the jar file to something like ai.djl.jar.

frankfliu commented 3 years ago

@macster110 I created a PR to address the module name issue; now each jar has a unique automatic module name: https://github.com/awslabs/djl/pull/627
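
Once that is in, a module-info like the one above could require DJL by a fixed name instead of the jar-derived api. A sketch, assuming the api jar's automatic module name becomes ai.djl.api (check the PR for the actual names):

module jdl4pam {
    exports org.jamdev.jdl4pam.genericmodel;
    // ... other exports as in the original module-info ...

    requires ai.djl.api; // assumed automatic module name for the DJL api jar
    requires java.desktop;
}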

macster110 commented 3 years ago

Great, thanks again for all the help - I will now close the issue.