uzairahmadxy opened 2 years ago
I forgot to mention I have a trial Healthcare license.
@uzairahmadxy, can you share the full error trace from the notebook, and also check your Jupyter shell for any errors and share those?
Hi @C-K-Loan. Here's the additional information
Thank you for sharing @uzairahmadxy. It looks like something is not set up correctly with your Hadoop utils. Make sure to precisely follow every step listed here: https://nlp.johnsnowlabs.com/docs/en/install#windows-support This should fix your issues.
Hi @C-K-Loan
I re-installed everything following the instructions. It still throws the error (note: I no longer see the Hadoop utils error in the Jupyter kernel, though).
Nice, that's one less error! @uzairahmadxy, can you test running this open-source notebook and see whether it works?
https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/1.SparkNLP_Basics.ipynb (you can skip the cells with pip install).
Also, could you copy-paste the entire error trace you get here or on https://pastebin.com/
Hi @C-K-Loan. This is for the healthcare notebook kernel: https://pastebin.com/cV6ymZvR
Side note: PySpark works fine (as shown in the screenshot; I previously thought there was an issue with Spark).
Thank you for sharing @uzairahmadxy
It looks like the jar loaded into your Spark session is missing some classes.
Running sparknlp.start() should have downloaded the fat jar, i.e. the one with all the dependencies.
@uzairahmadxy
Can you try manually downloading the Spark NLP jar and then starting a Spark session by passing its path?
I.e. download: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-4.2.2.jar
Then, instead of sparknlp.start(), run the following and continue with the rest of Notebook 1:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Spark NLP") \
    .master("local[*]") \
    .config("spark.driver.memory", "16G") \
    .config("spark.driver.maxResultSize", "0") \
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.jars", "path/to/the/spark-nlp.jar") \
    .getOrCreate()
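As an aside, since the error suggested the loaded jar is missing classes, you can sanity-check a downloaded jar without starting a JVM: a jar is just a zip archive, so Python's `zipfile` can list its entries. This is a small sketch (the jar path and class name in the comment are illustrative assumptions, not verified values):

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the given fully-qualified class has a .class entry in the jar.

    A jar is a plain zip archive, so we can inspect its contents
    with the standard library alone, no JVM needed.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Illustrative usage (path and class name are assumptions):
# jar_contains_class("spark-nlp-assembly-4.2.2.jar",
#                    "com.johnsnowlabs.nlp.SparkNLP")
```

If the check comes back False for classes the error trace complains about, the download was likely incomplete or it was not the fat (assembly) jar.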
Maybe this is a Windows-specific bug. @josejuanmartinez, I think you are on Windows; have you maybe seen this?
Hey, I am not on Windows anymore, sorry.
Thanks @C-K-Loan. Manually loading the jar worked for the basic Spark NLP notebook.
I guess the same will have to be done for the healthcare library as well. Can you please share where I can get its jar?
Hi @uzairahmadxy, great, good to know that this works, and sorry for the bug.
To get the healthcare jar, take the URL template below and replace {secret} with your healthcare secret and {lib_version} with the library version:
https://pypi.johnsnowlabs.com/{secret}/spark-nlp-jsl-{lib_version}.jar
I.e. if the secret is 4.2.1.agdfgdgdl, the URL would be
https://pypi.johnsnowlabs.com/4.2.1.agdfgdgdl/spark-nlp-jsl-4.2.1.jar
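To make the substitution concrete, here's a small sketch that assembles the URL from the secret. It assumes, as in the example above, that the library version is the leading x.y.z portion of the secret; the secret used below is the made-up placeholder from the example, not a real credential:

```python
def healthcare_jar_url(secret: str) -> str:
    """Build the spark-nlp-jsl jar URL from a license secret.

    Assumes the library version is the 'x.y.z' prefix of the secret,
    as in the example in the comment above.
    """
    lib_version = ".".join(secret.split(".")[:3])
    return f"https://pypi.johnsnowlabs.com/{secret}/spark-nlp-jsl-{lib_version}.jar"


# Placeholder secret from the example above (not a real credential):
print(healthcare_jar_url("4.2.1.agdfgdgdl"))
# https://pypi.johnsnowlabs.com/4.2.1.agdfgdgdl/spark-nlp-jsl-4.2.1.jar
```

Once downloaded, the healthcare jar can be passed to the session the same way as the open-source one, via the spark.jars config.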
@Meryem1425 can you see if you run into the same issue on Windows?
Thank you for sharing @C-K-Loan
While the jars are loaded, the problem still persists when I try to load pretrained healthcare models/pipelines.
Error Trace: https://pastebin.com/xtkJKVLk Jupyter Kernel: https://pastebin.com/fznqEBvq
_Side note: to manually download a healthcare model from the Models Hub, I'm assuming I have to specify the secret. How do we do that?_
Can you test whether your license is valid by running it in this notebook?
Can you share the latest versions you used (Java? PySpark? spark-nlp? spark-nlp-jsl?)
If you want to download manually, you can use this script; there is a similar example in this notebook:
from sparknlp.pretrained import ResourceDownloader
ResourceDownloader.downloadModelDirectly("clinical/models/embeddings_clinical_en_2.4.0_2.4_1580237286004.zip", "clinical/models")
The license works in a notebook (tried on Colab).
Here are the versions used:
I followed https://nlp.johnsnowlabs.com/docs/en/install#windows-support too, @uzairahmadxy. I set everything up as described and didn't hit any bug. Please make sure every step is applied correctly.
You have to create the java, spark, hadoop, and tmp folders under C:\, and then make sure the environment variables are set. Look at steps number 4 and 5.
Could you delete everything and then follow the installation steps again? Thank you
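For reference, the folder and environment-variable setup from those steps might look roughly like this on the Windows command line. This is an illustrative sketch only; the exact folder names and variables come from the linked install guide, so follow the guide rather than this fragment:

```shell
:: Illustrative only -- folders per the guide (C:\java, C:\spark, C:\hadoop, C:\tmp)
setx JAVA_HOME "C:\java"
setx SPARK_HOME "C:\spark"
setx HADOOP_HOME "C:\hadoop"
:: winutils.exe is assumed to live in %HADOOP_HOME%\bin
setx PATH "%PATH%;%JAVA_HOME%\bin;%HADOOP_HOME%\bin"
```

Open a fresh terminal after setting these, since setx only affects new sessions.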
@uzairahmadxy, I notice you are using OpenJDK, but Adopt OpenJDK is recommended.
Hi guys. I'm trying to run Spark NLP for Healthcare locally, and I seem to have compatible versions of Spark/Java, but it still throws an error (screenshots attached). Has anyone faced this?