JohnSnowLabs / nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
Apache License 2.0

load error #91

Open filemon11 opened 2 years ago

filemon11 commented 2 years ago

I get the following error when running:

import nlu
nlu.load('elmo')

Using the following configuration:
OS: Windows 10
Java version: 1.8.0_311 (Java 8)
Pyspark version: 3.1.2

:: loading settings :: url = jar:file:/C:/Spark/spark-3.2.0-bin-hadoop3.2/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: C:\Users\Lukas\.ivy2\cache
The jars for the packages stored in: C:\Users\Lukas\.ivy2\jars
com.johnsnowlabs.nlp#spark-nlp_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-f9a2f2a7-e7ac-44f5-a922-ae1493621cbc;1.0
    confs: [default]
    found com.johnsnowlabs.nlp#spark-nlp_2.12;3.3.4 in central
    found com.typesafe#config;1.4.1 in central
    found org.rocksdb#rocksdbjni;6.5.3 in central
    found com.amazonaws#aws-java-sdk-bundle;1.11.603 in central
    found com.github.universal-automata#liblevenshtein;3.0.0 in central
    found com.google.code.findbugs#annotations;3.0.1 in central
    found net.jcip#jcip-annotations;1.0 in central
    found com.google.code.findbugs#jsr305;3.0.1 in central
    found com.google.protobuf#protobuf-java-util;3.0.0-beta-3 in central
    found com.google.protobuf#protobuf-java;3.0.0-beta-3 in central
    found com.google.code.gson#gson;2.3 in central
    found it.unimi.dsi#fastutil;7.0.12 in central
    found org.projectlombok#lombok;1.16.8 in central
    found org.slf4j#slf4j-api;1.7.21 in central
    found com.navigamez#greex;1.0 in central
    found dk.brics.automaton#automaton;1.11-8 in central
    found org.json4s#json4s-ext_2.12;3.5.3 in central
    found joda-time#joda-time;2.9.5 in central
    found org.joda#joda-convert;1.8.1 in central
    found com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.3.3 in central
    found net.sf.trove4j#trove4j;3.0.3 in central
:: resolution report :: resolve 391ms :: artifacts dl 16ms
    :: modules in use:
    com.amazonaws#aws-java-sdk-bundle;1.11.603 from central in [default]
    com.github.universal-automata#liblevenshtein;3.0.0 from central in [default]
    com.google.code.findbugs#annotations;3.0.1 from central in [default]
    com.google.code.findbugs#jsr305;3.0.1 from central in [default]
    com.google.code.gson#gson;2.3 from central in [default]
    com.google.protobuf#protobuf-java;3.0.0-beta-3 from central in [default]
    com.google.protobuf#protobuf-java-util;3.0.0-beta-3 from central in [default]
    com.johnsnowlabs.nlp#spark-nlp_2.12;3.3.4 from central in [default]
    com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.3.3 from central in [default]
    com.navigamez#greex;1.0 from central in [default]
    com.typesafe#config;1.4.1 from central in [default]
    dk.brics.automaton#automaton;1.11-8 from central in [default]
    it.unimi.dsi#fastutil;7.0.12 from central in [default]
    joda-time#joda-time;2.9.5 from central in [default]
    net.jcip#jcip-annotations;1.0 from central in [default]
    net.sf.trove4j#trove4j;3.0.3 from central in [default]
    org.joda#joda-convert;1.8.1 from central in [default]
    org.json4s#json4s-ext_2.12;3.5.3 from central in [default]
    org.projectlombok#lombok;1.16.8 from central in [default]
    org.rocksdb#rocksdbjni;6.5.3 from central in [default]
    org.slf4j#slf4j-api;1.7.21 from central in [default]

    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   21  |   0   |   0   |   0   ||   21  |   0   |
    ---------------------------------------------------------------------

:: retrieving :: org.apache.spark#spark-submit-parent-f9a2f2a7-e7ac-44f5-a922-ae1493621cbc
    confs: [default]
    0 artifacts copied, 21 already retrieved (0kB/0ms)
22/01/14 17:30:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
elmo download started this may take some time.
22/01/14 17:31:05 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
EXCEPTION: Could not resolve singular Component for type=elmo and nlp_ref=elmo and nlu_ref=elmo and lang=en
Traceback (most recent call last):
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 708, in construct_component_from_identifier
    return Embeddings(get_default=False, nlp_ref=nlp_ref, nlu_ref=nlu_ref, lang=language,
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\components\embedding.py", line 98, in __init__
    else : self.model = SparkNLPElmo.get_pretrained_model(nlp_ref, lang)
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\components\embeddings\elmo\spark_nlp_elmo.py", line 14, in get_pretrained_model
    return ElmoEmbeddings.pretrained(name, language) \
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\annotator.py", line 7760, in pretrained
    return ResourceDownloader.downloadModel(ElmoEmbeddings, name, lang, remote_loc)
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\pretrained.py", line 50, in downloadModel
    file_size = _internal._GetResourceSize(name, language, remote_loc).apply()
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 231, in __init__
    super(_GetResourceSize, self).__init__(
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 165, in __init__
    self._java_obj = self.new_java_obj(java_obj, *args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 175, in new_java_obj
    return self._new_java_obj(java_class, *args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\pyspark\ml\wrapper.py", line 66, in _new_java_obj
    return java_obj(*java_args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\py4j\java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "D:\.venv\python3.8_nlu\lib\site-packages\pyspark\sql\utils.py", line 111, in deco
    return f(*a, **kw)
  File "D:\.venv\python3.8_nlu\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.getDownloadSize.
: java.lang.NoClassDefFoundError: org/json4s/package$MappingException
    at org.json4s.ext.EnumNameSerializer.deserialize(EnumSerializer.scala:53)
    at org.json4s.Formats$$anonfun$customDeserializer$1.applyOrElse(Formats.scala:66)
    at org.json4s.Formats$$anonfun$customDeserializer$1.applyOrElse(Formats.scala:66)
    at scala.collection.TraversableOnce.collectFirst(TraversableOnce.scala:180)
    at scala.collection.TraversableOnce.collectFirst$(TraversableOnce.scala:167)
    at scala.collection.AbstractTraversable.collectFirst(Traversable.scala:108)
    at org.json4s.Formats$.customDeserializer(Formats.scala:66)
    at org.json4s.Extraction$.customOrElse(Extraction.scala:775)
    at org.json4s.Extraction$.extract(Extraction.scala:454)
    at org.json4s.Extraction$.extract(Extraction.scala:56)
    at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:22)
    at com.johnsnowlabs.util.JsonParser$.parseObject(JsonParser.scala:28)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.parseJson(ResourceMetadata.scala:101)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$$anonfun$readResources$1.applyOrElse(ResourceMetadata.scala:129)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$$anonfun$readResources$1.applyOrElse(ResourceMetadata.scala:128)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at scala.collection.Iterator$$anon$13.next(Iterator.scala:593)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
    at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:184)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:47)
    at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
    at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
    at scala.collection.AbstractIterator.to(Iterator.scala:1431)
    at scala.collection.TraversableOnce.toList(TraversableOnce.scala:350)
    at scala.collection.TraversableOnce.toList$(TraversableOnce.scala:350)
    at scala.collection.AbstractIterator.toList(Iterator.scala:1431)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.readResources(ResourceMetadata.scala:128)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.readResources(ResourceMetadata.scala:123)
    at com.johnsnowlabs.client.aws.AWSGateway.getMetadata(AWSGateway.scala:78)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.downloadMetadataIfNeed(S3ResourceDownloader.scala:62)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.resolveLink(S3ResourceDownloader.scala:68)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.getDownloadSize(S3ResourceDownloader.scala:145)
    at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.getDownloadSize(ResourceDownloader.scala:445)
    at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader$.getDownloadSize(ResourceDownloader.scala:577)
    at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.getDownloadSize(ResourceDownloader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.json4s.package$MappingException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 51 more

Traceback (most recent call last):
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\__init__.py", line 236, in load
    nlu_component = nlu_ref_to_component(nlu_ref, authenticated=is_authenticated)
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 171, in nlu_ref_to_component
    resolved_component = resolve_component_from_parsed_query_data(language, component_type, dataset,
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 320, in resolve_component_from_parsed_query_data
    raise ValueError(f'EXCEPTION : Could not create NLU component for nlp_ref={nlp_ref} and nlu_ref={nlu_ref}')
ValueError: EXCEPTION : Could not create NLU component for nlp_ref=elmo and nlu_ref=elmo

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\__init__.py", line 255, in load
    raise Exception(
Exception: Something went wrong during loading and fitting the pipe. Check the other prints for more information and also verbose mode. Did you use a correct model reference?
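One detail worth noting in the log above: pyspark reports version 3.1.2, but the ivy line loads jars from C:/Spark/spark-3.2.0-bin-hadoop3.2, i.e. a different Spark minor release. That kind of skew is a common source of java.lang.NoClassDefFoundError like the org.json4s one here. A small sketch for spotting it from the SPARK_HOME directory name (the regex and function names are mine for illustration, not part of nlu or spark-nlp):

```python
import re
from typing import Optional

def spark_version_from_home(spark_home: str) -> Optional[str]:
    """Extract e.g. '3.2.0' from a path like 'C:/Spark/spark-3.2.0-bin-hadoop3.2'."""
    m = re.search(r"spark-(\d+\.\d+\.\d+)", spark_home)
    return m.group(1) if m else None

def same_major_minor(a: str, b: str) -> bool:
    """Compare only the major.minor parts of two version strings."""
    return a.split(".")[:2] == b.split(".")[:2]

if __name__ == "__main__":
    dist = spark_version_from_home("C:/Spark/spark-3.2.0-bin-hadoop3.2")
    print(dist)                              # 3.2.0
    print(same_major_minor("3.1.2", dist))   # False: pyspark 3.1.x vs Spark 3.2.x
```

If the two disagree, aligning the pip-installed pyspark with the Spark distribution on SPARK_HOME (or vice versa) would be a reasonable first thing to try.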

C-K-Loan commented 2 years ago

Hi @filemon11, this looks like Spark is not properly set up on Windows. Can you make sure you followed all the steps involved in installing spark-nlp and pyspark on Windows?

The Windows setup can be a bit tricky, but if you are just getting started we recommend using Google Colab, which instantly provides you with a working environment in your browser:
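For debugging the local setup, the usual Spark-on-Windows prerequisites are a JAVA_HOME and SPARK_HOME that are set, plus a HADOOP_HOME whose bin directory contains winutils.exe. A rough sanity-check sketch of that checklist (the helper name is made up; this is not part of nlu):

```python
import os
from pathlib import Path

def check_spark_windows_env(env=None):
    """Return a list of problems with a Spark-on-Windows environment mapping."""
    if env is None:
        env = os.environ
    problems = []
    # The three environment variables Spark-on-Windows guides normally require.
    for var in ("JAVA_HOME", "SPARK_HOME", "HADOOP_HOME"):
        if not env.get(var):
            problems.append(f"{var} is not set")
    # winutils.exe must live under %HADOOP_HOME%\bin on Windows.
    hadoop = env.get("HADOOP_HOME")
    if hadoop and not (Path(hadoop) / "bin" / "winutils.exe").exists():
        problems.append("winutils.exe not found under HADOOP_HOME\\bin")
    return problems
```

Running it with no arguments checks the current process environment; an empty result list means the basic prerequisites look in place.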

https://colab.research.google.com/drive/1j4Ek0JkBPmnK75qIxyYjVtYWNUPRbh9v?usp=sharing

hhassanien commented 2 years ago

@C-K-Loan would it be possible to share some guidance on the Windows installation, please?