JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0

Correct misspelled entities #13886

Closed: mirix closed this issue 7 months ago

mirix commented 1 year ago

Description

I am working with transcribed text. The overall quality of the transcription is excellent, but it contains a large number of misspelled entity names that are crucial for me.

For instance, "Swisscode" or "Swiss Gold" instead of the correct "Swissquote", or "Alex Hormital" instead of the correct "ArcelorMittal".

Preferred Solution

Before reinventing the wheel, I was wondering whether anyone is aware of an existing NER tool that can correct such mistakes.

Otherwise, any tips on the best approach for implementing such a solution would be greatly appreciated.

Additional Context

Ideally, the solution would use custom word lists and let me control the similarity parameters.
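
As a rough illustration of the behaviour described above, here is a minimal sketch that uses only Python's standard `difflib` module rather than any Spark NLP API. It assumes the entity mentions have already been extracted by an upstream NER step. The names `normalize`, `correct_entity`, `KNOWN_ENTITIES` and the `cutoff` default of 0.6 are hypothetical choices made for this example, not part of any existing tool.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so 'Swiss Gold' compares as 'swissgold'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def correct_entity(mention, known_entities, cutoff=0.6):
    """Return the closest known entity, or the mention unchanged if none reaches cutoff.

    `cutoff` is the similarity threshold (0.0 to 1.0); 0.6 is an assumed default.
    """
    best_name, best_score = mention, 0.0
    target = normalize(mention)
    for entity in known_entities:
        # Character-level similarity between the normalized strings.
        score = SequenceMatcher(None, target, normalize(entity)).ratio()
        if score >= cutoff and score > best_score:
            best_name, best_score = entity, score
    return best_name


# Custom word list of canonical entity names (hypothetical example data).
KNOWN_ENTITIES = ["Swissquote", "ArcelorMittal"]

for mention in ["Swisscode", "Swiss Gold", "Alex Hormital", "Nestle"]:
    print(mention, "->", correct_entity(mention, KNOWN_ENTITIES))
```

Raising `cutoff` makes the matching stricter, which reduces false corrections but leaves more misspellings untouched. Character-level similarity alone may not capture phonetic confusions well, so a phonetic key or an edit-distance library could be swapped in for `SequenceMatcher` if needed.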

github-actions[bot] commented 8 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days