datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

ValueError: The tagger is not opened #285

Open samratrcs opened 4 years ago

samratrcs commented 4 years ago

The following is the address I am trying to tag, and I am getting an error. Can you please help?

import usaddress
addressTag = usaddress.tag('2202 W Overland, Scottsbluff, NE 69361, United States')

C:\PyProject\StateContracts\usaddress\__init__.py:144: UserWarning: You must train the model (parserator train --trainfile FILES) to create the usaddr.crfsuite file before you can use the parse and tag methods
  'and tag methods' % MODEL_FILE)
Traceback (most recent call last):
  File "C:/PyProject/StateContracts/addressMapper/addressBreakerPOC.py", line 21, in <module>
    addressTag = usaddress.tag('2202 W Overland, Scottsbluff, NE 69361, United States')
  File "C:\PyProject\StateContracts\usaddress\__init__.py", line 166, in tag
    for token, label in parse(address_string):
  File "C:\PyProject\StateContracts\usaddress\__init__.py", line 155, in parse
    tags = TAGGER.tag(features)
  File "pycrfsuite\_pycrfsuite.pyx", line 630, in pycrfsuite._pycrfsuite.Tagger.tag
  File "pycrfsuite\_pycrfsuite.pyx", line 688, in pycrfsuite._pycrfsuite.Tagger.set
ValueError: The tagger is not opened

mortgagemetrix commented 4 years ago

I tried your code and got the following result:

(OrderedDict([('AddressNumber', '2202'), ('StreetNamePreDirectional', 'W'), ('StreetName', 'Overland'), ('PlaceName', 'Scottsbluff'), ('StateName', 'NE'), ('ZipCode', '69361'), ('CountryName', 'United States')]), 'Street Address')

Did you compile usaddress from source, or did you install it using pip? If you compiled it, did you try the nosetests? FYI, I'm running this on macOS in a virtual environment with Python 3.7.3.

CaffeineLab commented 1 year ago

I had this issue when building with pyinstaller so I'll post my solution in case it helps someone out.

You may be able to get it working by adding the following to your .spec file, with the path to the crfsuite file adjusted for your environment:

datas=[
    ('venv/Lib/site-packages/usaddress/usaddr.crfsuite', './usaddress'),
],
hiddenimports=['pycrfsuite._dumpparser', 'pycrfsuite._logparser'],

This creates a usaddress folder in your application bundle and copies the existing usaddr.crfsuite file into it, so that the tagger will find it at runtime. The hiddenimports entries resolved other issues I ran into, so I figured they should be here as well.
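If you'd rather not hard-code the venv path, something like the following might work. A .spec file is ordinary Python, so a small helper can build the datas entry by locating the installed package. This is only a sketch: `package_data_entry` is a hypothetical name of my own, not part of usaddress or PyInstaller, and it assumes the model file sits directly inside the package folder.

```python
import importlib.util
from pathlib import Path


def package_data_entry(package, filename):
    """Build a PyInstaller-style datas list for one file inside a package.

    Hypothetical helper (not part of usaddress or PyInstaller). Returns an
    empty list when the package or the file cannot be found.
    """
    # Locate the package on disk without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return []
    source = Path(spec.origin).parent / filename
    if not source.is_file():
        return []
    # Each entry is (source path on disk, destination folder in the bundle).
    return [(str(source), package)]
```

In the .spec file this would be used as `datas=package_data_entry('usaddress', 'usaddr.crfsuite')`. An empty list at build time is a hint that the model file is missing from the installed package itself.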

To find where the file needs to be copied to, print the value of MODEL_PATH from Lib/site-packages/usaddress/__init__.py. That is where the library looks for the usaddr.crfsuite file.
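A quick way to do that check might look like the sketch below. It reads the MODEL_PATH attribute mentioned above and reports whether a file actually exists there. `describe_model_path` is a hypothetical helper name, and the stand-in object at the bottom is only there so the snippet runs without usaddress installed.

```python
import os
from types import SimpleNamespace


def describe_model_path(module):
    """Report where a module expects its model file and whether it exists.

    Hypothetical helper; it only reads a MODEL_PATH attribute if present.
    """
    path = getattr(module, "MODEL_PATH", None)
    if path is None:
        return {"path": None, "exists": False}
    return {"path": path, "exists": os.path.isfile(path)}


# Stand-in object so the sketch runs on its own; in practice, pass the
# imported usaddress module instead.
fake = SimpleNamespace(MODEL_PATH=os.path.join(os.getcwd(), "usaddr.crfsuite"))
print(describe_model_path(fake))
```

With the real library you would call `describe_model_path(usaddress)` after `import usaddress`. If `exists` is false, that would be consistent with the "tagger is not opened" error, since the model file was never loaded.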

Hope this helps someone out.