JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0

SPARK_HOME Issue + DeIdentification Fix + Normalizer Improvement #310

Closed danilojsl closed 2 years ago

danilojsl commented 5 years ago
  1. There is an issue with the pip installation when the SPARK_HOME variable is set.
  2. There is a fix required in the DeIdentification unit tests.
  3. There is a pending improvement in the Normalizer.

Description

Installation of spark-nlp through the Python package works in all environments, but when SPARK_HOME is set, it does not find sparknlp.jar.
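
A quick way to confirm the triggering condition (a minimal sketch using only the standard library; the exact failure mode may vary by environment):

```python
import os

# The jar-resolution problem only shows up when SPARK_HOME is present
# in the environment of the Python process.
spark_home = os.environ.get("SPARK_HOME")
print("SPARK_HOME:", spark_home if spark_home else "<not set>")
```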

Expected Behavior

  1. When using spark-nlp under Python, it should locate sparknlp.jar automatically.
  2. Unit tests should not use pretrained models.
  3. The Normalizer should behave the same as before after the improvement.

Current Behavior

When SPARK_HOME is set, spark-nlp is not able to find sparknlp.jar, so a Python user has to set up the Spark session manually.
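
Until the fix lands, a possible workaround is to build the Spark session by hand and declare the Spark NLP package explicitly. This is only a sketch, not the project's official API; the Maven coordinate and the version placeholder below are assumptions and must match your installed spark-nlp release.

```python
from pyspark.sql import SparkSession

# Manually built session: declaring the package here lets Spark resolve the
# Spark NLP jar itself, bypassing the SPARK_HOME-based lookup.
# NOTE: "2.x.x" is a placeholder; use the version matching your pip install.
spark = SparkSession.builder \
    .appName("spark-nlp") \
    .master("local[*]") \
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.x.x") \
    .getOrCreate()
```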

Possible Solution

Steps to Reproduce

  1. Set SPARK_HOME
  2. Install spark-nlp through the Python package (pip)
  3. In a Python terminal, run the statements shown in the snippet below this list
  4. Observe that spark-nlp cannot find the jar
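
The statements referenced in step 3, split onto separate lines exactly as reported:

```python
from sparknlp.base import *

# With SPARK_HOME set, this fails to locate sparknlp.jar.
spark_nlp_session = SparkNLP().spark_session
```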

Context

Your Environment

maziyarpanahi commented 4 years ago

Is this issue resolved? @danilojsl

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 5 days