datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

Confidence score to validate the output #337

Open Aj-232425 opened 2 years ago

Aj-232425 commented 2 years ago

Hi,

This is a wonderful library. While digging in, I came across one question that I badly need answered. Suppose I want to apply it to millions of addresses. How would I know what percentage of addresses were parsed, normalized, or classified incorrectly? Is there any way to get a confidence score, i.e. the probability that a given output is correct, so that we can set a threshold value? Outputs scoring above that threshold would then not need manual validation.

Hoping for a response.

Thanks & Regards,
Aj
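
P.S. To make the thresholding idea concrete, here is a rough sketch of the workflow I have in mind. It does not call usaddress at all: `score` is a hypothetical stand-in for whatever confidence measure might be available, and the addresses and scores in the example are made-up placeholders, not real parser output.

```python
from typing import Callable, Iterable, List, Tuple

Scored = Tuple[str, float]


def split_by_confidence(
    addresses: Iterable[str],
    score: Callable[[str], float],  # hypothetical: returns a confidence in [0, 1]
    threshold: float = 0.9,
) -> Tuple[List[Scored], List[Scored]]:
    """Split addresses into (accepted, needs_review) by a confidence threshold."""
    accepted: List[Scored] = []
    needs_review: List[Scored] = []
    for address in addresses:
        confidence = score(address)
        # At or above the threshold: trust the parse; below: queue for manual checking.
        if confidence >= threshold:
            accepted.append((address, confidence))
        else:
            needs_review.append((address, confidence))
    return accepted, needs_review


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    fake_scores = {
        "123 Main St, Chicago, IL 60601": 0.98,
        "Main 123 apt ?? chicago": 0.41,
    }
    ok, review = split_by_confidence(fake_scores, fake_scores.get, threshold=0.9)
    print("accepted:", ok)
    print("needs review:", review)
```

With something like this, only the `needs_review` list would have to be validated by hand, and the threshold could be tuned against a small manually checked sample.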