JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.77k stars, 705 forks

Hardcoded column name in DocumentSimilarityRanker annotator #14195

Open jfernandrezj opened 4 months ago

jfernandrezj commented 4 months ago

Is there an existing issue for this?

Who can help?

@danilojsl @maziyarpanahi

What are you working on?

Getting similar documents from a knowledge base for RAG applications

Current Behavior

Both the current and the desired behavior are documented in the branch with the failing tests: https://github.com/JohnSnowLabs/spark-nlp/tree/issues/document-similarity-ranker-failing-test

Expected Behavior

Both the current and the desired behavior are documented in the branch with the failing tests: https://github.com/JohnSnowLabs/spark-nlp/tree/issues/document-similarity-ranker-failing-test

Steps To Reproduce

Run the tests added in the branch: https://github.com/JohnSnowLabs/spark-nlp/tree/issues/document-similarity-ranker-failing-test

Spark NLP version and Apache Spark

Spark 3.4, Spark NLP 5.2.2

Type of Spark Application

No response

Java Version

Java 11

Java Home Directory

No response

Setup and installation

No response

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

No response