datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

trouble training new data #216

Closed hurnhu closed 6 years ago

hurnhu commented 6 years ago

I've added 20 addresses for training, tagged them, and then trained using parserator train training/labeled.xml,training/new_addresses.xml usaddress.

But when I try to test the new training with parserator label measure_performance/test_data/new_tests.csv measure_performance/test_data/new_tests.xml usaddress, it still recommends the incorrect tags. The tags it recommends are exactly the same as the ones it recommended when training. Is this because I installed usaddress via pip first and then downloaded the git repo? It does not seem to be using the newly trained model.

jeancochrane commented 6 years ago

It's possible that the version of usaddress that you got via pip is taking priority over the new model in your Python PATH. IIRC you should be able to run pip install -e <path/to/usaddress> to install a local version of a package with pip. You could also run pip uninstall usaddress first just to make sure you've cleared out the old version. Does that end up fixing your issue?
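
One way to check which copy is actually being picked up is to ask Python where it would import usaddress from. The snippet below is only a rough sketch using the standard library; the helper names locate_package and is_inside are made up for illustration, and path/to/usaddress is a placeholder for wherever your checkout lives.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def locate_package(name: str) -> Optional[Path]:
    """Return the directory Python would import `name` from, or None.

    Hypothetical helper: it only inspects the import system and does
    not import the package itself.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve().parent


def is_inside(package_dir: Path, checkout: Path) -> bool:
    """Return True if package_dir lies within the checkout directory."""
    try:
        package_dir.relative_to(checkout.resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Placeholder path: replace with the location of your git checkout.
    checkout = Path("path/to/usaddress")
    found = locate_package("usaddress")
    if found is None:
        print("usaddress is not importable in this environment")
    elif is_inside(found, checkout):
        print(f"Using the local checkout: {found}")
    else:
        print(f"Using a different install: {found}")
```

If the printed path points into site-packages rather than your checkout, the pip-installed copy is the one being used, and the newly trained model in the repo would not be picked up.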

hurnhu commented 6 years ago

Yes, this was due to the pip install taking over. Creating a Python virtual environment fixed the problem.

jeancochrane commented 6 years ago

Awesome! Closing this then. Feel free to open a new issue if you run into more problems.