JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.86k stars, 711 forks

Py4JJavaError - When using MedicalNerModel and SentenceEntityResolverModel #6173

Closed sarathsurpur closed 3 years ago

sarathsurpur commented 3 years ago

Description

I am trying to use the licensed version of Spark NLP in the healthcare domain to extract ICD codes from clinical notes, following this notebook: https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/3.Clinical_Entity_Resolvers.ipynb I was able to download all the models except icd_resolver and clinical_ner. However, a model like clinical_ner works when I use PretrainedPipeline('explain_clinical_doc_era', 'en', 'clinical/models').

The error I am getting is posted below for your reference.

Expected Behavior

Current Behavior

Couldn't download the clinical NER model or the ICD-10 model.

Steps to Reproduce

Followed this notebook with a licensed version: https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Healthcare/3.Clinical_Entity_Resolvers.ipynb

Context

I am trying to extract ICD codes from notes, but I couldn't create a pipeline.

Your Environment

Error Text -

sentence_detector_dl_healthcare download started this may take some time.
Approximate size to download 367.3 KB
Download done! Loading the resource.
2021-09-28 18:22:15.930532: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[OK!]
embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
Download done! Loading the resource.
[OK!]
ner_clinical download started this may take some time.
Approximate size to download 13.9 MB
An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader.downloadModel.
: com.johnsnowlabs.license.exceptions.JslInvalidLicenseException: Wrong symbol in license key: [77aaa362].
    at com.johnsnowlabs.license.LicenseValidator$.$anonfun$getLicense$4(LicenseValidator.scala:96)
    at scala.Option.flatMap(Option.scala:271)
    at com.johnsnowlabs.license.LicenseValidator$.$anonfun$getLicense$2(LicenseValidator.scala:82)
    at scala.Option.getOrElse(Option.scala:189)
    at com.johnsnowlabs.license.LicenseValidator$.getLicense(LicenseValidator.scala:75)
    at com.johnsnowlabs.license.LicenseValidator$.checkValidLicenseIsPresent(LicenseValidator.scala:164)
    at com.johnsnowlabs.license.CheckLicense.checkValidEnvironment(CheckLicense.scala:19)
    at com.johnsnowlabs.license.CheckLicense.checkValidEnvironment$(CheckLicense.scala:18)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader$.checkValidEnvironment(InternalsPythonResourceDownloader.scala:13)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader$.downloadModel(InternalsPythonResourceDownloader.scala:37)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader.downloadModel(InternalsPythonResourceDownloader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
[OK!]

Py4JJavaError                             Traceback (most recent call last)
/var/folders/8n/zt5x88910bx47vs2xyhwl4nw0000gn/T/ipykernel_48112/1081320759.py in <module>
     20
     21 # Named Entity Recognition for clinical concepts.
---> 22 clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models") \
     23     .setInputCols(["sentence", "token", "word_embeddings"]) \
     24     .setOutputCol("ner")

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp_jsl/annotator.py in pretrained(name, lang, remote_loc)
   1901         from sparknlp.pretrained import ResourceDownloader
   1902         return ResourceDownloader.downloadModel(MedicalNerModel, name, lang, remote_loc,
-> 1903             j_dwn='InternalsPythonResourceDownloader')
   1904
   1905

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp/pretrained.py in downloadModel(reader, name, language, remote_loc, j_dwn)
     60         except Py4JJavaError as e:
     61             sys.stdout.write("\n" + str(e))
---> 62             raise e
     63         finally:
     64             stop_threads = True

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp/pretrained.py in downloadModel(reader, name, language, remote_loc, j_dwn)
     57         t1.start()
     58         try:
---> 59             j_obj = _internal._DownloadModel(reader.name, name, language, remote_loc, j_dwn).apply()
     60         except Py4JJavaError as e:
     61             sys.stdout.write("\n" + str(e))

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp/internal.py in __init__(self, reader, name, language, remote_loc, validator)
    212     def __init__(self, reader, name, language, remote_loc, validator):
    213         super(_DownloadModel, self).__init__("com.johnsnowlabs.nlp.pretrained." + validator + ".downloadModel", reader,
--> 214             name, language, remote_loc)
    215
    216

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp/internal.py in __init__(self, java_obj, *args)
    163         super(ExtendedJavaWrapper, self).__init__(java_obj)
    164         self.sc = SparkContext._active_spark_context
--> 165         self._java_obj = self.new_java_obj(java_obj, *args)
    166         self.java_obj = self._java_obj
    167

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/sparknlp/internal.py in new_java_obj(self, java_class, *args)
    173
    174     def new_java_obj(self, java_class, *args):
--> 175         return self._new_java_obj(java_class, *args)
    176
    177     def new_java_array(self, pylist, java_class):

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/pyspark/ml/wrapper.py in _new_java_obj(java_class, *args)
     64             java_obj = getattr(java_obj, name)
     65         java_args = [_py2java(sc, arg) for arg in args]
---> 66         return java_obj(*java_args)
     67
     68     @staticmethod

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306
   1307         for temp_arg in temp_args:

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
    109     def deco(*a, **kw):
    110         try:
--> 111             return f(*a, **kw)
    112         except py4j.protocol.Py4JJavaError as e:
    113             converted = convert_exception(e.java_exception)

~/.pyenv/versions/entity_extraction/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader.downloadModel.
: com.johnsnowlabs.license.exceptions.JslInvalidLicenseException: Wrong symbol in license key: [77aaa362].
    at com.johnsnowlabs.license.LicenseValidator$.$anonfun$getLicense$4(LicenseValidator.scala:96)
    at scala.Option.flatMap(Option.scala:271)
    at com.johnsnowlabs.license.LicenseValidator$.$anonfun$getLicense$2(LicenseValidator.scala:82)
    at scala.Option.getOrElse(Option.scala:189)
    at com.johnsnowlabs.license.LicenseValidator$.getLicense(LicenseValidator.scala:75)
    at com.johnsnowlabs.license.LicenseValidator$.checkValidLicenseIsPresent(LicenseValidator.scala:164)
    at com.johnsnowlabs.license.CheckLicense.checkValidEnvironment(CheckLicense.scala:19)
    at com.johnsnowlabs.license.CheckLicense.checkValidEnvironment$(CheckLicense.scala:18)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader$.checkValidEnvironment(InternalsPythonResourceDownloader.scala:13)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader$.downloadModel(InternalsPythonResourceDownloader.scala:37)
    at com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader.downloadModel(InternalsPythonResourceDownloader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
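
For readers skimming the traceback: the Python frames only relay the failure, and the useful part is the Java exception class and message in the final Py4JJavaError text (here, JslInvalidLicenseException with "Wrong symbol in license key"). The following is a minimal sketch of pulling that pair out of the error text with a regular expression. The helper name `root_cause` is hypothetical and not part of py4j or Spark NLP, and the pattern assumes the message is followed by stack frames of the form `at package.Class.method(`.

```python
import re

# Hypothetical helper, not part of py4j or Spark NLP.
# Pattern: a dotted Java class name ending in "Exception" or "Error",
# a colon, then the message up to the first "at pkg.Class.method(" frame
# (or the end of the text if no frames follow).
_JAVA_EXCEPTION = re.compile(
    r"((?:\w+\.)+\w*(?:Exception|Error)):\s*(.*?)(?=\s+at\s+[\w$.]+\(|\Z)",
    re.S,
)


def root_cause(error_text):
    """Return (java_exception_class, message), or None if nothing matches."""
    match = _JAVA_EXCEPTION.search(error_text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    # Shortened copy of the error text from this issue.
    sample = (
        "An error occurred while calling "
        "z:com.johnsnowlabs.nlp.pretrained.InternalsPythonResourceDownloader.downloadModel. "
        ": com.johnsnowlabs.license.exceptions.JslInvalidLicenseException: "
        "Wrong symbol in license key: [77aaa362]. "
        "at com.johnsnowlabs.license.LicenseValidator$.getLicense(LicenseValidator.scala:75) "
        "at java.lang.Thread.run(Thread.java:748)"
    )
    print(root_cause(sample))
```

In a notebook, the same function could be applied to `str(e)` inside an `except Py4JJavaError as e:` block, which is the text that `sparknlp/pretrained.py` writes to stdout in the frames shown above.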

maziyarpanahi commented 3 years ago

@sarathsurpur This repo only supports the public, open-source features. What you are using is licensed, and we have no information about it. Please either contact them directly via email or ask in the #healthcare channel on Slack.
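
Since the exception reads "Wrong symbol in license key", one inexpensive check before contacting support is whether the key string picked up stray characters when it was copied. The sketch below is an illustration only: it does not know which characters the license validator actually accepts, so it flags only common copy-paste damage (whitespace, quote marks, non-printable or non-ASCII characters). The function name `find_suspect_characters` is hypothetical, and reading the key from an environment variable named `SPARK_NLP_LICENSE` is an assumption about how the environment was set up.

```python
import os
import string


def find_suspect_characters(key):
    """Return (index, repr(char)) for characters that often sneak into a pasted key.

    This is NOT the library's validation logic; it only flags whitespace,
    quote marks, and characters outside printable ASCII.
    """
    suspects = []
    for index, char in enumerate(key):
        if char.isspace() or char in "\"'" or char not in string.printable:
            suspects.append((index, repr(char)))
    return suspects


if __name__ == "__main__":
    # Assumption: the key was exported under this variable name.
    key = os.environ.get("SPARK_NLP_LICENSE", "")
    if not key:
        print("SPARK_NLP_LICENSE is not set in this process")
    else:
        # Print positions only, never the key itself.
        print(f"key length: {len(key)}, suspect characters: {find_suspect_characters(key)}")
```

An empty result does not prove the key is valid; it only rules out the most common formatting problems, and a key that is expired or issued for a different product version would still need to be raised with the license provider.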