intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
Apache License 2.0

bigdl 0.11 is not compatible with Zoo #3032

Closed Le-Zheng closed 4 years ago

Le-Zheng commented 4 years ago

Error 1: when the Zoo utilities call model.summary(), a scala.MatchError: null is thrown at com.intel.analytics.zoo.pipeline.api.keras.layers.utils.KerasUtils$.countParams(KerasUtils.scala:337)

../test/zoo/models/anomalydetection/test_anomalydetector.py:31: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../zoo/models/common/zoo_model.py:80: in summary
    self.value)
../zoo/common/utils.py:133: in callZooFunc
    raise e
../zoo/common/utils.py:127: in callZooFunc
    java_result = api(*args)
E                   py4j.protocol.Py4JJavaError: An error occurred while calling o47.zooModelSummary.
E                   : scala.MatchError: null
E                       at com.intel.analytics.zoo.pipeline.api.keras.layers.utils.KerasUtils$.countParams(KerasUtils.scala:337)
E                       at com.intel.analytics.zoo.pipeline.api.keras.layers.utils.KerasUtils$.getLayerSummary(KerasUtils.scala:369)
E                       at com.intel.analytics.zoo.pipeline.api.keras.layers.utils.KerasUtils$.getNodeSummary(KerasUtils.scala:379)
E                       at com.intel.analytics.zoo.pipeline.api.keras.layers.utils.KerasUtils$.printNodeSummary(KerasUtils.scala:397)
E                       at com.intel.analytics.zoo.pipeline.api.keras.models.Model$$anonfun$summary$1.apply(Topology.scala:698)
E                       at com.intel.analytics.zoo.pipeline.api.keras.models.Model$$anonfun$summary$1.apply(Topology.scala:697)
E                       at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
E                       at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
E                       at com.intel.analytics.zoo.pipeline.api.keras.models.Model.summary(Topology.scala:697)
E                       at com.intel.analytics.zoo.pipeline.api.keras.models.Sequential.summary(Topology.scala:936)
E                       at com.intel.analytics.zoo.models.common.ZooModel.summary(ZooModel.scala:89)
E                       at com.intel.analytics.zoo.models.python.PythonZooModel.zooModelSummary(PythonZooModel.scala:386)
E                       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
E                       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
E                       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E                       at java.lang.reflect.Method.invoke(Method.java:498)

Error 2: TensorBoard FileWriter error. It may be necessary to add GraphDef.toByteString().

Traceback (most recent call last):
  File "/opt/work/jenkins/workspace/ZOO-PR-Python-ExampleTests-py37/pyzoo/zoo/examples/autograd/custom.py", line 61, in <module>
    model.save_graph_topology('./log')
  File "/opt/work/jenkins/workspace/ZOO-PR-Python-ExampleTests-py37/dist/lib/analytics-zoo-bigdl_0.11.0-spark_2.4.3-0.9.0-SNAPSHOT-python-api.zip/zoo/pipeline/api/keras/models.py", line 105, in save_graph_topology
  File "/opt/work/jenkins/workspace/ZOO-PR-Python-ExampleTests-py37/dist/lib/analytics-zoo-bigdl_0.11.0-spark_2.4.3-0.9.0-SNAPSHOT-python-api.zip/zoo/common/utils.py", line 133, in callZooFunc
  File "/opt/work/jenkins/workspace/ZOO-PR-Python-ExampleTests-py37/dist/lib/analytics-zoo-bigdl_0.11.0-spark_2.4.3-0.9.0-SNAPSHOT-python-api.zip/zoo/common/utils.py", line 127, in callZooFunc
  File "/opt/work/spark-2.4.3/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/opt/work/spark-2.4.3/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o48.zooSaveGraphTopology.
: java.lang.NoSuchMethodError: org.tensorflow.framework.GraphDef.toByteString()Lcom/intel/analytics/bigdl/shaded/protobuf/ByteString;
    at com.intel.analytics.bigdl.visualization.tensorboard.FileWriter.addGraphDef(FileWriter.scala:56)
    at com.intel.analytics.bigdl.nn.Graph.saveGraphTopology(Graph.scala:456)
    at com.intel.analytics.zoo.pipeline.api.keras.models.Model.saveGraphTopology(Topology.scala:644)
    at com.intel.analytics.zoo.pipeline.api.keras.python.PythonZooKeras.zooSaveGraphTopology(PythonZooKeras.scala:225)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

yangw1234 commented 4 years ago

Hi @jason-dai, I think the second error occurs because bigdl and analytics-zoo both depend on tensorflow-java (which in turn depends on protobuf), but they shade protobuf to different names.

Do you think it is ok if we shade protobuf to a common name, e.g. "com.intel.analytics.protobuf_v3_5_1"?
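
To make the mismatch concrete, here is a minimal sketch that splits the method signature from the NoSuchMethodError above into its owner class, method name, and return type. The helper name parse_missing_method is hypothetical, and it only handles the simple "object return type" form seen in this trace. The return type it extracts shows that the caller was compiled against a ByteString relocated under the bigdl shaded prefix, so a GraphDef built against a differently shaded protobuf would not offer that exact method.

```python
import re

# JVM-style signature copied from the NoSuchMethodError in error 2.
SIGNATURE = (
    "org.tensorflow.framework.GraphDef.toByteString()"
    "Lcom/intel/analytics/bigdl/shaded/protobuf/ByteString;"
)

# Hypothetical helper pattern: only matches "owner.method(args)Lreturn/Type;".
_PATTERN = re.compile(
    r"^(?P<owner>[\w.$]+)\.(?P<method>[\w$<>]+)"
    r"\((?P<args>[^)]*)\)L(?P<ret>[^;]+);$"
)


def parse_missing_method(signature):
    """Split a NoSuchMethodError signature into its parts."""
    match = _PATTERN.match(signature)
    if match is None:
        raise ValueError(f"unsupported signature: {signature!r}")
    return {
        "owner": match.group("owner"),
        "method": match.group("method"),
        "args": match.group("args"),
        # Convert the JVM internal name (slashes) to a dotted class name.
        "returns": match.group("ret").replace("/", "."),
    }


parsed = parse_missing_method(SIGNATURE)
print(parsed["owner"], parsed["method"], parsed["returns"])
```

Comparing the package prefix of "returns" with the prefix each jar actually relocates protobuf to is one way to confirm whether the two builds disagree.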

jason-dai commented 4 years ago

I do remember that BigDL has several issues related to TensorFlow shading. @Litchilitchy?

Litchilitchy commented 4 years ago

I have checked the pom file, and it seems the shading of org.tensorflow has been removed, so I think it should work now. @jason-dai
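
For anyone who wants to repeat that check, the following is a rough sketch of listing the shade-plugin relocations declared in a pom. The pom fragment is a made-up example, not the actual BigDL or Analytics Zoo pom; the shaded name in it is simply the one proposed earlier in this thread, and list_relocations is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical pom fragment for illustration only; NOT the real BigDL pom.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-shade-plugin</artifactId>
        <configuration>
          <relocations>
            <relocation>
              <pattern>com.google.protobuf</pattern>
              <shadedPattern>com.intel.analytics.protobuf_v3_5_1</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>"""

# Maven pom elements live in this XML namespace.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def list_relocations(pom_text):
    """Return (pattern, shadedPattern) pairs for every <relocation> element."""
    root = ET.fromstring(pom_text)
    result = []
    for relocation in root.iterfind(".//m:relocation", NS):
        pattern = relocation.findtext("m:pattern", default="", namespaces=NS)
        shaded = relocation.findtext("m:shadedPattern", default="", namespaces=NS)
        result.append((pattern.strip(), shaded.strip()))
    return result


relocations = list_relocations(SAMPLE_POM)
# True only if some relocation still targets the org.tensorflow packages.
tensorflow_shaded = any(p.startswith("org.tensorflow") for p, _ in relocations)
print(relocations, tensorflow_shaded)
```

Running the same function on the pom text of each project would show whether org.tensorflow is still relocated and whether both projects shade protobuf to the same name.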