sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software for processing and analyzing digital evidence, often seized at crime scenes by law enforcement or collected in corporate investigations by private examiners.

New Named Entity Recognition implementation #1550

Open lfcnassif opened 1 year ago

lfcnassif commented 1 year ago

We currently use the old version 3.8.0 of StanfordCoreNLP, while the latest version is 4.5.2. Unfortunately, it still doesn't have a Portuguese model, so we should look for an alternative library to process Portuguese texts.

lfcnassif commented 1 year ago

Reading these:
https://medium.com/quantrium-tech/top-3-packages-for-named-entity-recognition-e9e14f6f0a2a
https://spacy.io/usage/facts-figures#benchmarks-speed

It seems spaCy can be more than 10x faster than StanfordCoreNLP on the CPU. It also supports 70+ languages, including Portuguese, which StanfordCoreNLP does not support.
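
For reference, a minimal sketch of what Portuguese entity extraction with spaCy could look like. It assumes the small Portuguese model `pt_core_news_sm` has been installed separately (e.g. `python -m spacy download pt_core_news_sm`). The helper names `load_portuguese_pipeline` and `extract_entities` are hypothetical, not part of IPED or spaCy, and nothing here reflects how the integration would actually be done in IPED.

```python
from collections import defaultdict


def load_portuguese_pipeline(model_name="pt_core_news_sm"):
    """Load a spaCy pipeline by name (assumption: the model is already installed)."""
    # Imported lazily so extract_entities() can be used with any
    # spaCy-style callable, even when spaCy itself is not available.
    import spacy

    return spacy.load(model_name)


def extract_entities(nlp, text):
    """Run the pipeline on `text` and group entity strings by their label."""
    doc = nlp(text)
    grouped = defaultdict(list)
    for ent in doc.ents:
        # ent.label_ is the entity type string, ent.text the matched span.
        grouped[ent.label_].append(ent.text)
    return dict(grouped)
```

Usage would be something like `extract_entities(load_portuguese_pipeline(), "A Polícia Federal apreendeu documentos em Brasília.")`. Which entities and labels come back depends on the model, so no output is shown here. For large volumes of text, `nlp.pipe(texts)` processes documents in batches and would probably be the better entry point.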