Open jenghub opened 2 years ago
Thanks for reporting this issue @jenghub. This feature was just released recently and I will ask @vankov to have a look at this.
Any updates for this? I'm encountering pretty much the same error, except that `z:com.microsoft.spark.notebook.visualization.display.getDisplayResultForIPython` in OP's error is `oXXXX.showString` for me (where XXXX is some number). I noticed that the error starts occurring when the input has more than around 120 words.
- `sparknlp.version()`: 5.1.4
- `spark.version`: 3.5.0
- `java -version`: 1.8.0_392
- Installed via `pip install`
- OS: Ubuntu 20.04.5 LTS
- Other details: running offline; `spark.jars` was loaded into the SparkSession from a local file, and the model was loaded from a local copy of the one on Models Hub.
I am facing the same issue. Any updates on this?
Attempting to perform coreference resolution on a large dataframe containing English-language documents.
Description
Attempting to run `SpanBertCoref()` in a pipeline. The above block downloads the models correctly but then errors out on the coreference resolution step.
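A minimal sketch of the kind of pipeline described, following the SpanBertCorefModel example in the Spark NLP documentation (the column names here are illustrative, and `pretrained()` assumes network access to download the default `spanbert_base_coref` model):

```python
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetector, Tokenizer, SpanBertCorefModel
from pyspark.ml import Pipeline

# Split raw text into documents, sentences, and tokens,
# then resolve coreferences with the pretrained SpanBERT model.
document = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentences = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentences")

tokens = Tokenizer() \
    .setInputCols(["sentences"]) \
    .setOutputCol("tokens")

coref = SpanBertCorefModel.pretrained() \
    .setInputCols(["sentences", "tokens"]) \
    .setOutputCol("corefs")

pipeline = Pipeline(stages=[document, sentences, tokens, coref])
# result = pipeline.fit(df).transform(df)  # df is a DataFrame with a "text" column
```

If the failure only appears on longer inputs (as a commenter above observed around 120 words), checking the sentence and token annotations produced by the first three stages on a failing document may help narrow down whether the coref stage or the upstream annotators are at fault.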
Error:
Possible Solution
The demo code below works properly. Could there be issues in the tokens or sentences being detected? Some of the documents may contain artifacts, as they are translated into English in an earlier preprocessing step.
Context
Attempting to use Spark-based coreference resolution, as other libraries are either incompatible with my Spark environment or are slow and error out.
Your Environment
- `sparknlp.version()`: 4.0.2
- `spark.version`: 3.1.2.5.0-66290225
- `java -version`: 1.8.0_282