microsoft / SynapseML

Simple and Distributed Machine Learning
http://aka.ms/spark
MIT License

[BUG]'java.lang.NoClassDefFoundError: ai/onnxruntime/NodeInfo' #1682

Closed: julia-pp closed this issue 1 year ago

julia-pp commented 2 years ago

SynapseML version

0.10.1

Describe the problem

When I run ONNX inference on Spark, it throws a 'java.lang.NoClassDefFoundError: ai/onnxruntime/NodeInfo' error. I cloned the source and could not find a definition of 'NodeInfo' anywhere in the project, although it is used in https://github.com/microsoft/SynapseML/blob/master/deep-learning/src/main/scala/com/microsoft/azure/synapse/ml/onnx/ONNXModel.scala

Code to reproduce issue

        import os

        import onnx
        import pyspark

        # Start Spark and pull SynapseML 0.10.1 from the SynapseML Maven repository.
        spark = (
            pyspark.sql.SparkSession.builder.appName("MyApp")
            .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.1")
            .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
            .getOrCreate()
        )

        # Load the exported LightGBM model from ./data.
        onnx_model = onnx.load(os.path.join(os.getcwd(), "data", "lgb_single.onnx"))

        # Imported after the session is created, as in the original report.
        from synapse.ml.onnx import ONNXModel

        onnx_ml = ONNXModel().setModelPayload(onnx_model.SerializeToString())

        # print("Model inputs:" + str(onnx_ml.getModelInputs()))
        # print("Model outputs:" + str(onnx_ml.getModelOutputs()))

        # Map DataFrame columns to model inputs and model outputs to new columns.
        onnx_ml = (
            onnx_ml.setDeviceType("CPU")
            .setFeedDict({"input": "features"})
            .setFetchDict({"probability": "probabilities", "prediction": "label"})
            .setMiniBatchSize(1000)
        )

Other info / logs

Exception in thread "Thread-4" java.lang.NoClassDefFoundError: ai/onnxruntime/NodeInfo
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetPublicMethods(Class.java:2902)
    at java.lang.Class.getMethods(Class.java:1615)
    at py4j.reflection.ReflectionEngine.getMethodsByNameAndLength(ReflectionEngine.java:345)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:305)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
    at py4j.Gateway.invoke(Gateway.java:274)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: ai.onnxruntime.NodeInfo
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    ... 12 more
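
The `ClassNotFoundException` in the trace means that no jar on the driver classpath provides the entry `ai/onnxruntime/NodeInfo.class`. A minimal sketch for checking which downloaded jars, if any, contain that class is shown here. It relies only on the fact that a jar is a zip archive; the helper name `jars_containing` is hypothetical, and the directory `~/.ivy2/jars` is an assumption about where `spark.jars.packages` caches its downloads on this machine.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_dir):
    """Return the names of jars under jar_dir that contain class_name."""
    # A JVM class a.b.C is stored inside a jar as the zip entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip partially downloaded or corrupt files.
            continue
    return hits


if __name__ == "__main__":
    # Assumed location of the Ivy cache used by spark.jars.packages.
    jar_dir = Path.home() / ".ivy2" / "jars"
    print(jars_containing("ai.onnxruntime.NodeInfo", jar_dir))
```

An empty list would suggest that no ONNX Runtime jar with this class was resolved at all; a non-empty list would point instead at a classpath or loading problem.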

github-actions[bot] commented 2 years ago

Hey @julia-pp :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.

memoryz commented 2 years ago

Hi @julia-pp, this happens because we depend on the library "com.microsoft.onnxruntime" % "onnxruntime_gpu" % "1.8.1", which is not supported on macOS. To work around the problem, you can fork the code base and replace the dependency at https://github.com/microsoft/SynapseML/blob/f29318a274610dda543ee1422bdbd74cdb6a752a/build.sbt#L406 with the CPU flavor: "com.microsoft.onnxruntime" % "onnxruntime" % "1.8.1".

To build the source code, you can follow the instructions here: https://github.com/microsoft/SynapseML/blob/master/website/docs/reference/developer-readme.md
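
As a small illustration of the dependency swap described above, the following sketch returns the sbt dependency string that would apply on a given platform: the CPU artifact on macOS and the GPU artifact elsewhere. The function name `onnxruntime_sbt_dependency` is hypothetical, the artifact names and version are the ones quoted in this thread, and the rule "macOS means CPU flavor" is an assumption drawn from the comment above rather than from the SynapseML build itself.

```python
import platform

ORT_GROUP = "com.microsoft.onnxruntime"
ORT_VERSION = "1.8.1"  # version quoted in the thread


def onnxruntime_sbt_dependency(system=None):
    """Return the sbt dependency string for the given OS name.

    system follows platform.system() naming ("Darwin" for macOS);
    when omitted, the current machine is used.
    """
    system = system or platform.system()
    # Assumption from the thread: the GPU flavor is not supported on macOS.
    artifact = "onnxruntime" if system == "Darwin" else "onnxruntime_gpu"
    return f'"{ORT_GROUP}" % "{artifact}" % "{ORT_VERSION}"'


if __name__ == "__main__":
    print(onnxruntime_sbt_dependency())
```

The returned string is what would go in place of the existing line in the forked `build.sbt` before rebuilding.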