JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.88k stars, 711 forks

Correct misspelled entities #13886

Closed: mirix closed this issue 10 months ago

mirix commented 1 year ago

Description

I am working with transcribed text. The general quality of the transcription is excellent, but it contains a large number of misspelled entity names that are crucial for me.

For instance, "Swisscode" or "Swiss Gold" instead of the correct "Swissquote", or "Alex Hormital" instead of the correct "ArcelorMittal".

Preferred Solution

Before reinventing the wheel, I was wondering whether anyone is aware of an existing NER tool that can correct such mistakes.

Otherwise, any tips on the best approach for implementing such a solution would be greatly appreciated.

Additional Context

Ideally, the solution would use custom word lists and let me control the similarity parameters.
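
To make the idea concrete, here is a minimal sketch of what I have in mind. It uses only Python's standard `difflib` module, not a Spark NLP API. It assumes the entity mentions have already been extracted from the transcript by some NER step. The names `normalize`, `best_match`, and `KNOWN_ENTITIES`, as well as the default threshold of 0.6, are hypothetical and only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical custom word list of correct entity names.
KNOWN_ENTITIES = ["Swissquote", "ArcelorMittal"]


def normalize(text):
    """Lowercase and drop spaces and punctuation, so "Swiss Gold" -> "swissgold"."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def best_match(candidate, known_entities, threshold=0.6):
    """Return (entity, score) for the closest known entity.

    The entity is None when no score reaches `threshold`, which is the
    similarity parameter to tune (an assumed default, not a recommendation).
    """
    cand = normalize(candidate)
    best_name, best_score = None, 0.0
    for name in known_entities:
        # ratio() is a similarity in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, cand, normalize(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


if __name__ == "__main__":
    for mention in ["Swisscode", "Swiss Gold", "Alex Hormital", "Microsoft"]:
        entity, score = best_match(mention, KNOWN_ENTITIES)
        print(f"{mention!r} -> {entity!r} (score {score:.2f})")
```

Character-level similarity like this is only a rough proxy for how transcription errors arise. A phonetic comparison would probably suit speech transcripts better, but the overall shape (custom list plus a tunable threshold) is what I am after.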

github-actions[bot] commented 10 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days