Sotera / newman

Quickly analyze and explore email with advanced analytics and visualization.
http://sotera.github.io/newman/
Apache License 2.0

Entity extraction performance #128

Open smahoney58 opened 5 years ago

smahoney58 commented 5 years ago

The Entity Extraction feature is not highlighting all of the entities in the email body. Jane Doe was identified as an entity (Person), so I expected Jane Doe to be highlighted in the email body, but it was not.

John Doe, for whatever reason, was highlighted correctly. The fake Mastercard number was highlighted incorrectly: only part of the number was highlighted instead of all 16 digits. Other numbers that were identified as entities in this email were not highlighted either.

smahoney58 commented 5 years ago

Steps to reproduce:
1. Search for “test”. This finds all 9 emails in the dataset.
2. Select the second email, “test email”.

Expected results:
- “Jane Doe” is marked as a person (the system did find John Doe).
- The other numeric entities are highlighted.
- All 16 digits of the fake Mastercard number are marked.
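
To make the expected behavior concrete, here is a minimal sketch of the check I have in mind. It is not Newman's highlighting code; the function name `find_highlight_spans`, the sample body, and the entity list are all hypothetical stand-ins modeled on the email described above. The idea is simply that every extracted entity string should map to a full-length span in the body.

```python
import re


def find_highlight_spans(body, entities):
    """Return sorted (start, end, entity) spans for each occurrence of each entity in body.

    Hypothetical helper for illustration only; not part of Newman.
    """
    spans = []
    for entity in entities:
        # re.escape so digits, spaces, and punctuation are matched literally
        for match in re.finditer(re.escape(entity), body):
            spans.append((match.start(), match.end(), entity))
    return sorted(spans)


# Hypothetical sample data; the card number is a placeholder 16-digit value.
body = "Hi John Doe, please forward this to Jane Doe. Card: 5555 5555 5555 4444."
entities = ["John Doe", "Jane Doe", "5555 5555 5555 4444"]

spans = find_highlight_spans(body, entities)
highlighted = [body[start:end] for start, end, _ in spans]

# Any entity that was extracted but has no full-length span is a highlighting miss.
missing = [entity for entity in entities if entity not in highlighted]
print("highlighted:", highlighted)
print("missing:", missing)
```

In the behavior I am reporting, the equivalent of `missing` would contain “Jane Doe” and the numeric entities, and the card number would be covered by a span shorter than the full 16 digits.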