Closed TheSimpleMachine closed 7 years ago
Hey @TheSimpleMachine,
Thanks for filing these two issues. Recipient examples are a bit sparse in our training data, so you'll sometimes get wonky behavior like this. Glad you figured out a good workaround with "ATTN".
I'm going to go ahead and add your examples to our next batch of training data. I'll close this issue when there's a new release and the behavior is fixed. If your data includes any more examples of valid addresses using recipient names, send 'em our way – more examples make the model more robust.
The states identified in the example below are BH and CA. usaddress.parse('165 WAIKIKI LN HUNTINGTON BH CA 92649-2326 ')
The states identified in the example below are BR and GA. usaddress.parse('412 THURMON TANNER RD FLOWERY BR GA 30542-2824 ')
The Apartment 'A2' is identified as an additional StateName along with OH. usaddress.parse('503 DIERKER RD APARTMENT A2 COLUMBUS OH 43220-5248 ')
CT is identified as a StateName instead of a StreetNamePostType. MI is correctly identified as StateName. usaddress.parse('460 LESPERANCE CT ESSEXVILLE MI 48732-1911 ')
The states identified in the example below are LA and CA. usaddress.parse('827 CALLE SLORISTAS LA QUINTA CA 92253 ')
I appreciate these examples! I'll add them to our training data. It seems like you have more examples of problems 1 and 2, do you mind sharing them with us? You can just copy/paste the addresses here.
The latest release has provided better support for "CT", so if you upgrade the package you should find that error to be fixed.
I did update it to 0.5.8 today. The CT examples are fixed only in cases where CT was the only token labeled as a state. However, when an address has both CT and a state, both are now treated as states. Let me know if you ran the example above and it worked.
BH examples: 211 PARKSIDE LN # 138 HUNTINGTON BH CA 92647-7610 42 ALEXANDRIA DR HUNTINGTON BH CA 92647-2522 161 ROBERT LN HUNTINGTON BH CA 92647 162 HOBART LN HUNTINGTON BH CA 92647-4034 1982 FELCLIFF LN HUNTINGTON BH CA 92646-3923 1621 WAIKIKI LN HUNTINGTON BH CA 92649-2326 881 MIDBURY DR HUNTINGTON BH CA 92646-4643 521 PENINSULA LN HUNTINGTON BH CA 92648-6615 262 LEASURE LN HUNTINGTON BH CA 92646-7137 211 SAND DOLLAR LN HUNTINGTON BH CA 92646-7003 211 BANFF LN HUNTINGTON BH CA 92646-6801 942 SANTIAGO DR HUNTINGTON BH CA 92646-6337 671 GLENEAGLES CIR HUNTINGTON BH CA 92648-5561 62 SHIELDS DR HUNTINGTON BH CA 92647-4245 2121 GREENBORO LN HUNTINGTON BH CA 92646-7020 1675 LYNN ST # 3 HUNTINGTON BH CA 92649-5053 562 SPA DR HUNTINGTON BH CA 92647-2025 1421 BELMAR CIR HUNTINGTON BH CA 92647-2333 428 16TH ST # 2 HUNTINGTON BH CA 92648-4269 1768 EDGEWATER LN HUNTINGTON BH CA 92649-4208 1642 STILES CIR HUNTINGTON BH CA 92649-3065 158 PLYMOUTH LN HUNTINGTON BH CA 92647-3277 1051 EL CAPITAN LN HUNTINGTON BH CA 92646-0000 391 SAMOA DR HUNTINGTON BH CA 92646-2560 211 GREENBORO LN HUNTINGTON BH CA 92646-7020 1661 LIMELIGHT CIR # D HUNTINGTON BH CA 92647-5221 981 CATHAY CIR HUNTINGTON BH CA 92646-4817 141 BELMAR CIR HUNTINGTON BH CA 92647-2333 8892 SATTERFIELD DR HUNTINGTON BH CA 92646-7139
BR examples 51 LAKE RUN DR FLOWERY BR GA 30542-3886 90 THURMON TANNER RD FLOWERY BR GA 30542-2824 23 SPRING LAKE DR FLOWERY BR GA 30542-6601 17 TASCOSA DR FLOWERY BR GA 30542-5574 18 CARDINAL RIDGE WAY FLOWERY BR GA 30542-3551 21 HOLLAND VIEW DR FLOWERY BR GA 30542-5747 67 CHESTNUT HILL RD FLOWERY BR GA 30542-3822 15 KEEPSAKE LN FLOWERY BR GA 30542-7549 11 OLD ORR RD FLOWERY BR GA 30542-3444 85 MARTIN TRL FLOWERY BR GA 30542-3549 85 MARTIN TRL FLOWERY BR GA 30542-3549 85 MARTIN TRL FLOWERY BR GA 30542-3549 20 BATTLE RIDGE DR FLOWERY BR GA 30542-6103 53 THORNBURY CLOSE WAY FLOWERY BR GA 30542-3749 11 WILLOWBROOK TRL FLOWERY BR GA 30542-3893 42 BARRINGTON GRN FLOWERY BR GA 30542-4602 33 SHERWOOD MILL DR FLOWERY BR GA 30542-7525 60 ORVILLE DR FLOWERY BR GA 30542-5314 29 STILLWATER PL FLOWERY BR GA 30542-5317 54 TURK RD FLOWERY BR GA 30542-5131 18 SLEEPY LAGOON WAY FLOWERY BR GA 30542-7556 18 SLEEPY LAGOON WAY FLOWERY BR GA 30542-7556 02 ARBOR PT FLOWERY BR GA 30542-2613 46 STRICKLAND BLVD FLOWERY BR GA 30542-3637 85 MARTIN TRL FLOWERY BR GA 30542-3549
Weird! Here's my output:
In [5]: usaddress.parse('460 LESPERANCE CT ESSEXVILLE MI 48732-1911 ')
Out[5]:
[('460', 'AddressNumber'),
('LESPERANCE', 'StreetName'),
('CT', 'StreetNamePostType'),
('ESSEXVILLE', 'PlaceName'),
('MI', 'StateName'),
('48732-1911', 'ZipCode')]
What output are you seeing? Is it failing on other addresses?
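To answer "is it failing on other addresses?" across a whole list, a small check like the one below may help. It is only a sketch: it assumes the parser returns a list of (token, label) pairs, as in the output quoted above. `GOOD_PARSE` is copied from that output; `BAD_PARSE` is a hand-written illustration of the reported symptom (both BH and CA labeled StateName), and every other label in it is a guess, not real parser output. The helper names are made up for this example.

```python
from typing import Callable, Iterable, List, Tuple

Parsed = List[Tuple[str, str]]

# Copied from the output quoted in this thread.
GOOD_PARSE: Parsed = [
    ("460", "AddressNumber"),
    ("LESPERANCE", "StreetName"),
    ("CT", "StreetNamePostType"),
    ("ESSEXVILLE", "PlaceName"),
    ("MI", "StateName"),
    ("48732-1911", "ZipCode"),
]

# Hypothetical: only the two StateName labels come from the report;
# the remaining labels are assumptions made for illustration.
BAD_PARSE: Parsed = [
    ("165", "AddressNumber"),
    ("WAIKIKI", "StreetName"),
    ("LN", "StreetNamePostType"),
    ("HUNTINGTON", "PlaceName"),
    ("BH", "StateName"),
    ("CA", "StateName"),
    ("92649-2326", "ZipCode"),
]


def state_tokens(parsed: Parsed) -> List[str]:
    """Return every token labeled StateName, without trailing commas."""
    return [tok.rstrip(",") for tok, label in parsed if label == "StateName"]


def find_multi_state(
    addresses: Iterable[str], parse: Callable[[str], Parsed]
) -> List[Tuple[str, List[str]]]:
    """Return (address, state tokens) for addresses with more than one StateName."""
    flagged = []
    for address in addresses:
        states = state_tokens(parse(address))
        if len(states) > 1:
            flagged.append((address, states))
    return flagged
```

With the library installed, passing `usaddress.parse` as the `parse` argument would run the same check over a real list of addresses. Note that legitimately multi-word state names would also be flagged, so the results need a manual look.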
HL examples: 411 STEVENS CT LAFAYETTE HL PA 19444-1748 33 PIN OAK CT LAFAYETTE HL PA 19444-2506 566 RED RAMBLER DR LAFAYETTE HL PA 19444-2109 881 RED RAMBLER DR LAFAYETTE HL PA 19444-2124 3314 PRUSSIAN HILL RD MOKELUMNE HL CA 95245-9644 62835 JESUS MARIA RD MOKELUMNE HL CA 95245-9658 232 MIMOSA CIR LAFAYETTE HL PA 19444-2407 4027 WISTERIA LN LAFAYETTE HL PA 19444-2111 3204 BASSWOOD DR LAFAYETTE HL PA 19444-2328 6428 BIRCH DR LAFAYETTE HL PA 19444-2125 665 BIRCH DR LAFAYETTE HL PA 19444-2103 83 LOCUST WAY LAFAYETTE HL PA 19444-2435 99 HONEY LOCUST CT LAFAYETTE HL PA 19444-2520 581 EMERSON DR LAFAYETTE HL PA 19444-1347 3136 ALAN DR WILLOUGHBY HL OH 44092-1208 653 FORSYTHIA CT LAFAYETTE HL PA 19444-2504 2308 GOWAN LN LAFAYETTE HL PA 19444-2028 924 FLOURTOWN RD LAFAYETTE HL PA 19444-1005 974 LOCUST WAY LAFAYETTE HL PA 19444-2435 6400 FOXHOUND DR LAFAYETTE HL PA 19444-1014 821 WOODRUFF RD LAFAYETTE HL PA 19444-1617 8025 WESTAWAY DR LAFAYETTE HL PA 19444-1541 756 N WARNER RD LAFAYETTE HL PA 19444-1427 2166 CENTER AVE LAFAYETTE HL PA 19444-1411 4103 FIELDS DR LAFAYETTE HL PA 19444-1531 3804 SHEEPS RUN LAFAYETTE HL PA 19444-1020 617 N SWEET GUM LN LAFAYETTE HL PA 19444-2600 762 RIDGE PIKE LAFAYETTE HL PA 19444-2018 716 HAWTHORNE CIR LAFAYETTE HL PA 19444-2416 707 RIDGE PIKE LAFAYETTE HL PA 19444-1721 353 GERMANTOWN PIKE LAFAYETTE HL PA 19444-1620 790 GERMANTOWN PIKE LAFAYETTE HL PA 19444-1109 904 PINE TREE RD LAFAYETTE HL PA 19444-1608 709 FOXWOOD CIR LAFAYETTE HL PA 19444-1646 1701 GERMANTOWN PIKE APT 1008 LAFAYETTE HL PA 19444-1158
PK Examples: 37015 S CENTRAL PARK AVE EVERGREEN PK IL 60805-3406 7426 KEMMAN AVE LA GRANGE PK IL 60526-1665 4506 RAMONA TER MACHESNEY PK IL 61115-3844 91322 JEFF DR. MACHESNEY PK IL 61115-7436 5943 RAYMOND AVE LA GRANGE PK IL 60526-1356 88456 S TROY AVE MERRIONETT PK IL 60803-4537 80407 W ORANGE DR LITCHFIELD PK AZ 85340-4151 16531 W SELLS DR LITCHFIELD PK AZ 85340-5160 50605 VENTURA BLVD MACHESNEY PK IL 61115-1062 19909 W SAN MIGUEL AVE LITCHFIELD PK AZ 85340-9514 91308 W ANNIKA DR LITCHFIELD PK AZ 85340-7364 46413 W SOLANO DR LITCHFIELD PK AZ 85340-7361 50514 W ROVEY CT LITCHFIELD PK AZ 85340-5373 36154 S DENNY BLVD APT 209 LITCHFIELD PK AZ 85340-9447 742 SCOTDALE RD LA GRANGE PK IL 60526-1033 906 JENNIE DR MACHESNEY PK IL 61115-1826 27806 W EARLL CT LITCHFIELD PK AZ 85340-8540 951 LAGUNA DR W LITCHFIELD PK AZ 85340-4741 32240 N SNOW HILL MANOR RD LEXINGTON PK MD 20653-3118 15240 N SNOW HILL MANOR RD LEXINGTON PK MD 20653-3118 96582 PERSHING DR LEXINGTON PK MD 20653-5254 56753 CROMWELL AVE FAIRVIEW PK OH 44126-2607 417 CASCADE DR INDIANHEAD PK IL 60525-4430 93285 TOWN CREEK DR LEXINGTON PK MD 20653-6346 9318 GRAND AVE HUNTINGTON PK CA 90255-6304 2516 W 91ST ST EVERGREEN PK IL 60805-1310 4616 W 91ST ST EVERGREEN PK IL 60805-1310 1865 RANDOLPH ST HUNTINGTON PK CA 90255-3029 2105 GRAND AVE HUNTINGTON PK CA 90255-6311 7417 W 100TH ST EVERGREEN PK IL 60805-2643 5206 HOPE ST HUNTINGTON PK CA 90255-6204 6418 W 101ST ST EVERGREEN PK IL 60805-3545 1018 W 101ST ST EVERGREEN PK IL 60805-3545 1518 W 101ST ST EVERGREEN PK IL 60805-3545 955 N CLOVERFIELD TER LITCHFIELD PK AZ 85340-6018 479 N BRAINARD AVE LA GRANGE PK IL 60526-1807 6232 W 101ST ST EVERGREEN PK IL 60805-3513 9036 CALIFORNIA ST HUNTINGTON PK CA 90255-5914 1536 W 101ST PL EVERGREEN PK IL 60805-3511 253 LIVE OAK ST HUNTINGTON PK CA 90255-6107 4364 LIVE OAK ST HUNTINGTON PK CA 90255-6108 9517 HOPE ST HUNTINGTON PK CA 90255-6211 7451 W MAPLE ST EVERGREEN PK IL 60805-3045 2514 GRAND AVE HUNTINGTON PK CA 90255-6236 
3438 E 61ST ST HUNTINGTON PK CA 90255-3378 1609 VAN BUREN ST UNIVERSITY PK MD 20782-1414 319 BROADWAY HUNTINGTON PK CA 90255-6544 6519 BROADWAY HUNTINGTON PK CA 90255-6544 587 E PALM ST LITCHFIELD PK AZ 85340-4801 27 E PALM ST LITCHFIELD PK AZ 85340-4801 7811 SERENE CIR FRUITLAND PK FL 34731-6056 7115 E 60TH ST HUNTINGTON PK CA 90255-3407 375 N KENSINGTON AVE LA GRANGE PK IL 60526-1873 20359 CULLISON LN LEXINGTON PK MD 20653-3714 66359 CULLISON LN LEXINGTON PK MD 20653-3714 55583 VALLEY CT APT 8006 LEXINGTON PK MD 20653-1875 7101 N HIDDEN TER LITCHFIELD PK AZ 85340-5066 95340 LONG LN LEXINGTON PK MD 20653-4527 686 COLLEGE AVE FRUITLAND PK FL 34731-2314 9214 N PAJARO CT LITCHFIELD PK AZ 85340-3302 7819 MALABAR ST APT 16 HUNTINGTON PK CA 90255-7149 4819 MALABAR ST APT 16 HUNTINGTON PK CA 90255-7149 5619 MALABAR ST APT 16 HUNTINGTON PK CA 90255-7149 5661 LOMA VISTA AVE HUNTINGTON PK CA 90255-3342 345 N BRAINARD AVE LA GRANGE PK IL 60526-5501 779 N SPRING AVE LA GRANGE PK IL 60526-5541 961 SHERWOOD RD LA GRANGE PK IL 60526-5620 7231 SANTA FE AVE HUNTINGTON PK CA 90255-3805 2138 RITA AVE APT A HUNTINGTON PK CA 90255-4188 3711 GRAND POINT AVE UNIVERSITY PK FL 34201-2126 8023 MILES AVE APT A HUNTINGTON PK CA 90255-5046 8832 COCHISE DR INDIANHEAD PK IL 60525-4306 9840 COYOTE RIDGE CT UNIVERSITY PK FL 34201-2117 895 W MILLER ST FRUITLAND PK FL 34731-2244 933 N SPRING AVE LA GRANGE PK IL 60526 778 FOREST RD LA GRANGE PK IL 60526-1538 837 N BRAINARD AVE LA GRANGE PK IL 60526-1405 612 N CATHERINE AVE LA GRANGE PK IL 60526-1511 145 N WAIOLA AVE LA GRANGE PK IL 60526-1453 8359 S CALIFORNIA AVE EVERGREEN PK IL 60805-1122 8259 S CALIFORNIA AVE EVERGREEN PK IL 60805-1122 4759 S CALIFORNIA AVE EVERGREEN PK IL 60805-1122 7659 S CALIFORNIA AVE EVERGREEN PK IL 60805-1122 9027 S CLIFTON PARK AVE EVERGREEN PK IL 60805-1508 6727 S CLIFTON PARK AVE EVERGREEN PK IL 60805-1508 1827 S CLIFTON PARK AVE EVERGREEN PK IL 60805-1508 6746 S ALBANY AVE EVERGREEN PK IL 60805-1719 220 MEADOWCREST RD 
LA GRANGE PK IL 60526-1529 4405 S MILLARD AVE EVERGREEN PK IL 60805-1812 5105 S MILLARD AVE EVERGREEN PK IL 60805-1812 7927 S SPAULDING AVE EVERGREEN PK IL 60805-2200 517 S CENTRAL PARK AVE EVERGREEN PK IL 60805-3002 953 S RICHMOND AVE EVERGREEN PK IL 60805-2633 2158 S CENTRAL PARK AVE EVERGREEN PK IL 60805-3003
PT Examples: 2550 NE 27TH CT LIGHTHOUSE PT FL 33064-7712 1950 NE 39TH ST APT W203 LIGHTHOUSE PT FL 33064-7491 3540 NE 30TH ST LIGHTHOUSE PT FL 33064-7631 3440 NE 30TH ST LIGHTHOUSE PT FL 33064-7631 9251 NE 42ND CT APT 139 LIGHTHOUSE PT FL 33064-9047 3011 NE 34TH CT LIGHTHOUSE PT FL 33064-8148 1115 NE 25TH ST LIGHTHOUSE PT FL 33064-8346 6401 NE 45TH ST LIGHTHOUSE PT FL 33064-7242 6331 NE 36TH ST LIGHTHOUSE PT FL 33064-8568 9275 NE 48TH CT APT 105 LIGHTHOUSE PT FL 33064-7908 1350 NE 23RD AVE LIGHTHOUSE PT FL 33064-3904 5910 NE 26TH AVE LIGHTHOUSE PT FL 33064-8044 3610 NE 27TH AVE LIGHTHOUSE PT FL 33064-8056 2310 NE 27TH AVE LIGHTHOUSE PT FL 33064-8056 8330 NE 27TH AVE LIGHTHOUSE PT FL 33064-8060 8511 NE 28TH AVE LIGHTHOUSE PT FL 33064-7915 6941 NE 31ST AVE LIGHTHOUSE PT FL 33064-7838 7869 CAMP OKEE DR GLOUCESTER PT VA 23062-2502 7810 JORDAN RD GLOUCESTER PT VA 23062-2222
TW Examples: 1503 BEAR CREEK BLVD BEAR CREEK TW PA 18702-9441 960 RIVONA DR W BLOOMFLD TW MI 48328-4783 4735 BEAR CREEK BLVD BEAR CREEK TW PA 18702-9780 155 LAUREL RUN RD BEAR CREEK TW PA 18702-9468 871 MEADOW RUN RD BEAR CREEK TW PA 18702-9631 93 TWINBROOK RD BEAR CREEK TW PA 18702-8415
VY Examples: 2822 HENRIETTA AVE HUNTINGDON VY PA 19006-8504 6625 HUNTINGDON PIKE HUNTINGDON VY PA 19006-8307 6851 SHERMAN AVE HUNTINGDON VY PA 19006-8630 3555 HILLVIEW TURN HUNTINGDON VY PA 19006-2816 560 STEELE WAY HUNTINGDON VY PA 19006-3114 570 STEELE WAY HUNTINGDON VY PA 19006-3114 138 HILL HOUSE HUNTINGDON VY PA 19006-6906 2494 SOMERS RD HUNTINGDON VY PA 19006-1916 564 SOMERSET RD HUNTINGDON VY PA 19006-6723 9015 WRIGHT DR HUNTINGDON VY PA 19006-2725 70 LAWSON DR HUNTINGDON VY PA 19006-1607 9100 HEATON HILL LN HUNTINGDON VY PA 19006-3242 4280 BUCK HILL DR HUNTINGDON VY PA 19006-7910 1080 BUCK HILL DR HUNTINGDON VY PA 19006-7910 9208 COUNTY LINE RD HUNTINGDON VY PA 19006-1701 4656 OAK HILL DR HUNTINGDON VY PA 19006-7725 1400 BYBERRY RD STE 1100 HUNTINGDON VY PA 19006-3523 6800 MELMAR RD HUNTINGDON VY PA 19006-7971 5015 MELMAR RD HUNTINGDON VY PA 19006-7970 2535 CATHEDRAL RD HUNTINGDON VY PA 19006-5003 6335 CATHEDRAL RD HUNTINGDON VY PA 19006-5003 160 AUTUMN LEAF LN HUNTINGDON VY PA 19006-1526 2769 LIPPINCOTT RD HUNTINGDON VY PA 19006-7924 2250 WILLIAMSBURG RD HUNTINGDON VY PA 19006-6757 1638 KENT RD HUNTINGDON VY PA 19006-6621 538 KENT RD HUNTINGDON VY PA 19006-6621 6638 KENT RD HUNTINGDON VY PA 19006-6621 5338 KENT RD HUNTINGDON VY PA 19006-6621 7120 EDGE HILL RD. 
HUNTINGDON VY PA 19006-5618 403 COACHLIGHT TER HUNTINGDON VY PA 19006-3012 3042 BARRY LN HUNTINGDON VY PA 19006-5410 8821 HONEYSUCKLE LN HUNTINGDON VY PA 19006-5447 47 AMES CIR UNIT F4 HUNTINGDON VY PA 19006-7976 696 STOKES CIR HUNTINGDON VY PA 19006-7973 56 STOKES CIR HUNTINGDON VY PA 19006-7973 366 STOKES CIR HUNTINGDON VY PA 19006-7973 112 COUNTY LINE ROAD HUNTINGDON VY PA 19006-2303 367 SHADY LN HUNTINGDON VY PA 19006-8746 7190 MASONS MILL RD HUNTINGDON VY PA 19006-4409 2109 GREG LN HUNTINGDON VY PA 19006-3213 7709 GREG LN HUNTINGDON VY PA 19006-3213 2445 SUNSET WAY HUNTINGDON VY PA 19006-7753 1803 HOLT LN HUNTINGDON VY PA 19006-2610 9103 HOLT LN HUNTINGDON VY PA 19006-2610 7703 HOLT LN HUNTINGDON VY PA 19006-2610 949 EMERSON RD HUNTINGDON VY PA 19006-3022 3180 RIDGEVIEW RD HUNTINGDON VY PA 19006-3318 3596 SIDNEY RD HUNTINGDON VY PA 19006-2347 79 SIMONS WAY HUNTINGDON VY PA 19006-4248 2709 SHELLEY RD HUNTINGDON VY PA 19006-2344 924 ROCKLEDGE AVE HUNTINGDON VY PA 19006-8619 426 ANDREW RD HUNTINGDON VY PA 19006-2313 306 ANDREW RD HUNTINGDON VY PA 19006-2313 320 WINGATE RD HUNTINGDON VY PA 19006-8422 98 STEVEN DR HUNTINGDON VY PA 19006-6608 859 LONGVIEW DR HUNTINGDON VY PA 19006-2221 533 WELSH RD APT 503 HUNTINGDON VY PA 19006-6336 153 WELSH RD APT 503 HUNTINGDON VY PA 19006-6336 70 ELFRETH RD HUNTINGDON VY PA 19006-1211 86 ELFRETH RD HUNTINGDON VY PA 19006-1211 161 ROBIN LN HUNTINGDON VY PA 19006-2115 495 CLAIRE AVE HUNTINGDON VY PA 19006-8601 2205 1551 HUNTINGDON PIKE HUNTINGDON VY PA 19006-7715 12AD E COHEN 1057 TWIN SILO LN HUNTINGDON VY PA 19006-3341 9ZEL E HOPE 1601 HUNTINGDON RD HUNTINGDON VY PA 19006-4412
WD Examples: 95004 KINGSTON AVE HUNTINGTON WD MI 48070-1112 30064 NADINE AVE HUNTINGTON WD MI 48070-1516 96064 NADINE AVE HUNTINGTON WD MI 48070-1516 10095 TALBOT AVE HUNTINGTON WD MI 48070-1134 19154 LASALLE BLVD HUNTINGTON WD MI 48070-1162 38415 BORGMAN AVE HUNTINGTON WD MI 48070-1104 13535 ELGIN AVE HUNTINGTON WD MI 48070-1536 83127 NADINE AVE HUNTINGTON WD MI 48070-1420 33144 WINCHESTER AVE HUNTINGTON WD MI 48070-1727 1951 LINCOLN DR HUNTINGTON WD MI 48070-1626
I only have the following other examples for problem 1: 2260 PEACHTREE RD NW CONDO D1 ATLANTA GA 30309 9621 FOXHOUND DRIVE APARTMENT 1C MIAMISBURG OH 45342-5548 1568 LEXINGTON AVE APARTMENT 4L MANSFIELD OH 44907-2640
Here is a random example I came across where the parser could not handle PlaceName and StateName correctly: 'Alesia Hixenbaugh 9 Front St Washington District of Columbia DC 20001'
Thanks for putting in some elbow grease on these examples – we very much appreciate this kind of thoroughness. I'll incorporate the new training data first thing tomorrow morning. As always, if you need a working build before then you'll be able to get good results by training your own model.
I am glad to help out, and thanks for pointing out the option of training my own model. However, I just started learning Python (my first programming language) early last week on Codecademy (50% complete), and as such the model training instructions went over my head.
I am a quick study, and I want to learn to train my own model. I would really appreciate it if you could point me to a more detailed step by step guide, or send me sample code (with all the steps) that you have used in the past to train the model.
Specifically, I'd like to know what format the [infile] needs to be in and how to call on the file using Python in this command: parserator label [infile] training/labeled.xml usaddress
Also, how do I make pull requests to contribute back?
Sure thing! I'm putting together some detailed documentation for making new training data, as per issue #146. That should be live sometime tomorrow.
In the meantime, take a look at GitHub's documentation on how to fork a repo and how to make a pull request. The model training is done in the command line (Terminal if you're on Mac OS X), so you'll probably want to get familiar with command line scripting while you're at it.
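As a stopgap for the [infile] question, here is a rough sketch of writing raw addresses to a file from Python. It assumes the labeling step reads a CSV with one unlabeled address per row; that is an assumption about the input format, so please check it against the parserator documentation before relying on it. The function name and the `new_addresses.csv` file name are made up for this example.

```python
import csv
from typing import Iterable


def write_raw_addresses(addresses: Iterable[str], path: str) -> int:
    """Write unique, non-empty addresses to `path`, one per row.

    Returns the number of rows written. Runs of whitespace are collapsed
    so that trailing spaces do not create near-duplicate rows.
    """
    seen = set()
    rows = []
    for address in addresses:
        cleaned = " ".join(address.split())
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            rows.append(cleaned)

    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow([row])
    return len(rows)


# Example call (hypothetical file name):
# write_raw_addresses(
#     ["85 MARTIN TRL FLOWERY BR GA 30542-3549",
#      "85 MARTIN TRL FLOWERY BR GA 30542-3549 "],
#     "new_addresses.csv",
# )
```

The resulting file would then be passed as [infile] on the command line. Dropping duplicates is a design choice here, since several of the pasted examples repeat the same address.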
The latest commit adds in your training data. The instructions for building and testing the code should get you a working copy that fixes the errors you listed above:
git clone https://github.com/datamade/usaddress.git
cd usaddress
pip install -r requirements.txt
python setup.py develop
parserator train training/labeled.xml usaddress
I'm going to wait to cut a new pip release, but this should get you rolling for now (and hopefully help you play with the library in a productive way).
I've also added a new piece of documentation that provides a step-by-step guide for making new training/testing data. You can find the testing data I made out of your examples in commit d3e716d, in the files training/thesimplemachine_train.xml and measure_performance/test_data/thesimplemachine_test.xml, in case you want to play around with them. (In the latest commit I've appended them to the master training/testing files, which we like to do to keep things clean.)
As always, let me know if you have any questions!
Thanks. I am reviewing all the info you provided. Hope to be contributing soon.
It's been a while, so I'm not 100% sure where this is at, but it looks like we added the training data in. Closing.
Without commas we get the proper recipient tags and address tags: usaddress.parse('JASON BOURNE 90 CLAIRE GULCH CLANCY MT 59634-9528 ')
However, adding commas in different spots does not appear to work: usaddress.parse('JASON BOURNE, 90 CLAIRE GULCH, CLANCY MT 59634-9528 ') usaddress.parse('JASON, BOURNE, 90 CLAIRE GULCH, CLANCY MT 59634-9528 ') usaddress.parse('JASON BOURNE, 90 CLAIRE GULCH, CLANCY, MT, 59634-9528')
However, adding c/o, ATT, ATTN, star, FOB, etc. in front of the above three will produce the correct result (as in the first case above). Example: usaddress.parse('ATT JASON BOURNE, 90 CLAIRE GULCH, CLANCY, MT, 59634-9528')
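The prefix workaround could be applied in bulk before parsing. The sketch below is one way to do it, under a loud assumption: it treats any address that does not start with a digit as beginning with a recipient name. That heuristic is mine and will be wrong for inputs such as PO boxes or spelled-out house numbers. The function name and the prefix set are made up for this example.

```python
# Prefixes mentioned in this thread that already mark a recipient.
KNOWN_PREFIXES = {"C/O", "ATT", "ATTN", "FOB"}


def add_attn_prefix(address: str, prefix: str = "ATTN") -> str:
    """Prepend `prefix` when the address appears to start with a recipient.

    Heuristic (an assumption, not library behavior): an address that starts
    with a digit is taken to start with the house number and is left alone.
    """
    stripped = address.strip()
    if not stripped or stripped[0].isdigit():
        return stripped

    first_word = stripped.split()[0].rstrip(",.:").upper()
    if first_word in KNOWN_PREFIXES:
        return stripped  # already prefixed, nothing to do

    return f"{prefix} {stripped}"
```

The returned string can then be handed to `usaddress.parse` as in the examples above.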