Open wangzhixuan opened 7 years ago
Hey @wangzhixuan,
Thanks for filing this! Seems like a similar issue to the one Ben identified in #132. We've been meaning to provide better support for lots and trailers. Any idea what SPC stands for?
If you want to add your own training data and submit a PR, we have a new guide up for that now. Otherwise, can you paste in a few examples for each occupancy type? 4-6 examples for each pattern should be enough.
There are actually more cases, but the 3 I mentioned above are the most common occupancy types that usaddress cannot recognize. I remember seeing others, such as SIDE, FRNT, BAY, and PH, as well.
You can find the meaning of those abbreviations here: http://www.expertmarket.com/USPS-street-suffix
In addition to more training data, we could add a feature for occupancy types, like we did for street types and directionals: https://github.com/datamade/usaddress/blob/master/usaddress/__init__.py#L262
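A rough sketch of what such a lookup feature might look like. This is not the existing usaddress code: the set OCCUPANCY_TYPES and the helpers is_occupancy_type and token_features are hypothetical names, and the abbreviations are only the ones mentioned in this thread.

```python
import re

# Hypothetical lookup set: abbreviations collected from this thread,
# not an official or complete list.
OCCUPANCY_TYPES = {"lot", "trlr", "spc", "side", "frnt", "bay", "ph"}


def is_occupancy_type(token: str) -> bool:
    """Return True if the token, ignoring case and punctuation,
    is one of the known occupancy-type abbreviations."""
    cleaned = re.sub(r"[^\w]", "", token).lower()
    return cleaned in OCCUPANCY_TYPES


def token_features(token: str) -> dict:
    """Build a small per-token feature dict (hypothetical shape) that a
    sequence tagger could consume alongside its other features."""
    return {
        "occupancy_type": is_occupancy_type(token),
        "length": len(token),
    }


# Flag candidate occupancy tokens in one of the addresses from this thread.
address = "9020 W Avenue J Spc 25, Lancaster, CA 93536"
flagged = [tok for tok in address.split() if is_occupancy_type(tok)]
```

A feature like this is only a hint for the model. Words such as "Bay" also appear in street names, so it would probably still need more training data to resolve those cases from context.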
@fgregg I agree with you.
@fgregg @wangzhixuan Agreed!
@jeancochrane I tried to retrain the model myself, but nosetests came up with too many failed cases. I don't have time to look into them, so I decided to put my additional training examples here.
"451 County Route 11 LOT 56, West Monroe, NY 13167"
"192 State Highway 1959 Lot 26, Grayson, KY 41143"
"12475 State Highway 180 LOT 26, Gulf Shores, AL 36542"
"W7772 Wisconsin Pkwy Lot 8B, Delavan, Wisconsin, 53115"
"13809 Bandera St Trlr 3, Houston, TX, 77015"
"TRLR 153-168, 8028 Wichita St, Fort Worth, TX 76140"
"6485 Us Highway 10 W Trlr 52 , Missoula, Montana, 59808"
"900 Broken Feather Trl TRLR 324, Pflugerville, TX 78660"
"641 N SCRAPER ST TRLR 3, VINITA OK 74301"
"176 SE COUNTY ROAD Y LOT 25, WARRENSBURG MO 64093"
"3933 E AZ Highway 260 Spc 155. Payson, AZ 85541"
"1624 N Coast Highway 101 Spc 53, Encinitas, CA 92024"
"601 Pacheco Rd SPC 116, Bakersfield, CA, 93307"
"9020 W Avenue J Spc 25, Lancaster, CA 93536"
"351 H Avenue, Building 442, San Francisco, CA 94130"
These are all addresses from the internet.
I'm sorry to hear that! If you have time to sort through the testing failures I'd be happy to provide some help troubleshooting. Thanks for pasting the examples here – I can take a stab at a fix when I'm back from vacation in two weeks.
It seems to me that some valid occupancy types are never correctly recognized, for example Lot, TRLR, and SPC.