tensorflow / java

Java bindings for TensorFlow

Caused by: java.lang.UnsatisfiedLinkError: C:\Users\liwan\.javacpp\cache\tensorflow-core-native-1.0.0-rc.1-windows-x86_64.jar\org\tensorflow\internal\c_api\windows-x86_64\jnitensorflow.dll: Can't find dependent libraries #543

Open LEEay opened 3 weeks ago

LEEay commented 3 weeks ago

Caused by: java.lang.UnsatisfiedLinkError: C:\Users\liwan\.javacpp\cache\tensorflow-core-native-1.0.0-rc.1-windows-x86_64.jar\org\tensorflow\internal\c_api\windows-x86_64\jnitensorflow.dll: Can't find dependent libraries

Java 17

saudet commented 3 weeks ago

Please follow the instructions at https://github.com/bytedeco/javacpp-presets/wiki/Debugging-UnsatisfiedLinkError-on-Windows

LEEay commented 3 weeks ago

I fixed the missing DLLs using that tool, but running it still fails with: org.tensorflow.exceptions.TFFailedPreconditionException: Could not find variable batch_normalization_170/moving_mean. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Resource localhost/batch_normalization_170/moving_mean/class tensorflow::Var does not exist.

Inference works fine in Python. The model is https://github.com/KichangKim/DeepDanbooru; both its .h5 and .pb run inference correctly in Python, but TF-Java keeps throwing this error.

LEEay commented 3 weeks ago

https://github.com/KichangKim/DeepDanbooru

Craigacp commented 3 weeks ago

TF-Java doesn't support h5 files. If you load the model in TF Python and then save it as a TF SavedModel, it should work.
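
A minimal sketch of that conversion in Python (file and directory names here are placeholders, not the exact paths from this thread):

```python
import tensorflow as tf

# Load the Keras .h5 checkpoint, then export a SavedModel directory
# (saved_model.pb + variables/) that TF-Java's SavedModelBundle can read.
model = tf.keras.models.load_model("model-resnet_custom_v3.h5", compile=False)
tf.saved_model.save(model, "deepdanbooru-saved-model")
```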

LEEay commented 3 weeks ago

I did convert it to a .pb model:

```python
import tensorflow as tf
import tf2onnx

loaded_model = tf.keras.models.load_model(r"E:\Download\model-resnet_custom_v3.h5", compile=False)

# Save the model as a SavedModel (.pb)
tf.saved_model.save(loaded_model, r"E:\Download\mymodel\deepdanbooru-v3-20211112")

# Load the saved model back (note: tf.saved_model.load expects the export
# directory, not the saved_model.pb file inside it)
loaded_model = tf.saved_model.load(r"E:\Download\mymodel\deepdanbooru-v3-20211112")
```

I'm using v3-20211112-sgd-e28 from https://github.com/KichangKim/DeepDanbooru/tags.

After conversion, running it in TF-Java throws the error:

```java
Session session = SavedModelBundle.load("E:\\Download\\mymodel\\deepdanbooru-v3-20211112", "serve").session();
FloatNdArray matrix3d = NdArrays.ofFloats(org.tensorflow.ndarray.Shape.of(1, 512, 512, 3));
TFloat32 rank3Tensor = TFloat32.tensorOf(matrix3d);
System.out.println(rank3Tensor.toString());
Tensor resultTensor = session.runner()
        .feed("serving_default_inputs:0", rank3Tensor)
        .fetch("StatefulPartitionedCall:0")
        .run().get(0);
System.out.println(resultTensor.shape());
session.close();
```
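
If the feed/fetch names are ever in doubt, a minimal Python sketch (reusing the same export directory; "serving_default" is the signature key tf.saved_model.save generates by default for Keras models) can print the serving signature before wiring up the Java side:

```python
import tensorflow as tf

# Print the serving signature so the Java feed/fetch names
# ("serving_default_inputs:0", "StatefulPartitionedCall:0") can be verified.
loaded = tf.saved_model.load(r"E:\Download\mymodel\deepdanbooru-v3-20211112")
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)  # input spec(s)
print(infer.structured_outputs)          # output spec(s)
```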

Craigacp commented 3 weeks ago

It has the same error as your earlier comment?

LEEay commented 3 weeks ago

Yes.

```
2024-06-14 11:31:35.282952: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-06-14 11:31:35.477578: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: E:\Download\mymodel\deepdanbooru-v3-20211112
2024-06-14 11:31:35.533223: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-06-14 11:31:35.533251: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: E:\Download\mymodel\deepdanbooru-v3-20211112
2024-06-14 11:31:35.533351: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: SSE SSE2 SSE3 SSE4.1 SSE4.2 AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-14 11:31:36.092973: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
2024-06-14 11:31:36.136289: I tensorflow/cc/saved_model/loader.cc:234] Restoring SavedModel bundle.
2024-06-14 11:31:37.619728: I tensorflow/cc/saved_model/loader.cc:218] Running initialization op on SavedModel bundle at path: E:\Download\mymodel\deepdanbooru-v3-20211112
2024-06-14 11:31:38.008834: I tensorflow/cc/saved_model/loader.cc:317] SavedModel load for tags { serve }; Status: success: OK. Took 2531231 microseconds.
DenseTFloat32(shape=[1, 512, 512, 3])
2024-06-14 11:31:40.720244: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: FAILED_PRECONDITION: Could not find variable batch_normalization_89/beta. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Resource localhost/batch_normalization_89/beta/class tensorflow::Var does not exist.
	 [[{{function_node __inference_serving_default_11824}}{{node resnet_custom_v3_1/batch_normalization_89_1/Cast_3/ReadVariableOp}}]]
Exception in thread "main" org.tensorflow.exceptions.TFFailedPreconditionException: Could not find variable batch_normalization_89/beta. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Resource localhost/batch_normalization_89/beta/class tensorflow::Var does not exist.
	 [[{{function_node __inference_serving_default_11824}}{{node resnet_custom_v3_1/batch_normalization_89_1/Cast_3/ReadVariableOp}}]]
	at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:84)
	at org.tensorflow.Session.run(Session.java:835)
	at org.tensorflow.Session$Runner.runHelper(Session.java:558)
	at org.tensorflow.Session$Runner.run(Session.java:485)
	at com.drsle3.douzhanpro.qingfeng.common.PredictNN.main(PredictNN.java:26)
```

Craigacp commented 3 weeks ago

Batch norm can be a pain in Keras. Can you try setting the model to non-trainable before saving out the SavedModel in Python? I think it's failing to find a variable that should be fixed after training, but it might have saved the training version of the model. Alternatively, it looks like it might also behave a little differently when saved on GPU vs CPU, so you could try loading it directly onto a GPU with the config options.
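
A minimal sketch of that suggestion in Python, reusing the paths from the earlier comments (whether this resolves the missing-variable error is exactly what's being tested here):

```python
import tensorflow as tf

# Load the checkpoint, mark every layer (including batch norm) as
# non-trainable, then re-export the SavedModel for TF-Java.
model = tf.keras.models.load_model(r"E:\Download\model-resnet_custom_v3.h5", compile=False)
model.trainable = False
tf.saved_model.save(model, r"E:\Download\mymodel\deepdanbooru-v3-20211112")
```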

LEEay commented 3 weeks ago

After converting the model to ONNX using the official method, I can run inference and get results with the ONNX code below.

```java
// Requires: ai.onnxruntime.* (ONNX Runtime Java API),
// org.tensorflow.ndarray.{FloatNdArray, NdArrays, Shape},
// java.nio.FloatBuffer, java.util.{HashMap, Map}
try {
    // Initialize the ONNX Runtime environment
    OrtEnvironment env = OrtEnvironment.getEnvironment();

    // Load the ONNX model
    String modelPath = "E:\\Download\\mymodel\\onnxmodel1.onnx";
    OrtSession.SessionOptions sessionOptions = new OrtSession.SessionOptions();
    OrtSession session = env.createSession(modelPath, sessionOptions);

    // Create a TensorFlow FloatNdArray
    FloatNdArray matrix3d = NdArrays.ofFloats(Shape.of(1, 512, 512, 3));
    // Fill the FloatNdArray with data (placeholder only; use real data in practice)
//    for (int i = 0; i < 512 * 512 * 3; i++) {
//        matrix3d.setFloat(1.0f, 0, i / (512 * 512), (i % (512 * 512)) / 512, i % 512);
//    }

    // Copy the FloatNdArray into a FloatBuffer
    FloatBuffer floatBuffer = FloatBuffer.allocate((int) matrix3d.size());
    matrix3d.scalars().forEachIndexed((coords, scalar) -> floatBuffer.put(scalar.getFloat()));
    floatBuffer.flip();

    // Create the ONNX tensor
    long[] inputShape = new long[]{1, 512, 512, 3};
    OnnxTensor inputTensor = OnnxTensor.createTensor(env, floatBuffer, inputShape);

    // Build the input map (the input node name must match your model)
    Map<String, OnnxTensor> inputs = new HashMap<>();
    inputs.put("inputs", inputTensor);

    // Run inference
    OrtSession.Result results = session.run(inputs);

    // Get the output (adjust the output node name for your model)
    OnnxValue outputValue = results.get(0);
    float[] outputData = ((OnnxTensor) outputValue).getFloatBuffer().array();

    // Print the output
    System.out.println("Model output:");
    for (float value : outputData) {
        System.out.println(value);
    }

    // Release resources
    inputTensor.close();
    results.close();
    session.close();
    env.close();
} catch (OrtException e) {
    e.printStackTrace();
}
```

Craigacp commented 3 weeks ago

So it didn't change if you exported it after setting it to eval mode, but tf2onnx made an ONNX model which worked?

LEEay commented 3 weeks ago

After converting to the .pb format (https://github.com/tensorflow/java/issues/543#issuecomment-2167108575), I converted it to ONNX with `python -m tf2onnx.convert --saved-model mymodel/deepdanbooru-v3-20211112/ --output mymodel/onnxmodel1.onnx --opset 9`, and then simple inference with the ONNX Java code works correctly.
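
For what it's worth, the converted model can also be sanity-checked directly from Python; a minimal sketch, assuming the input name "inputs" that the Java code above feeds:

```python
import numpy as np
import onnxruntime as ort

# Run the tf2onnx output through ONNX Runtime with a dummy all-zeros image.
sess = ort.InferenceSession("mymodel/onnxmodel1.onnx")
x = np.zeros((1, 512, 512, 3), dtype=np.float32)
outputs = sess.run(None, {"inputs": x})
print(outputs[0].shape)
```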