psolin / cleanco

Company Name Processor written in Python
MIT License
318 stars, 94 forks

Fix issues with legal forms matching and add tests #73

Closed: marekmodry closed this 3 years ago

marekmodry commented 3 years ago

This PR addresses several issues:

1) Matching worked only for companies with single-token legal forms (such as plc or ltd).
2) Matching returned ambiguous results that could have been disambiguated. When the input name had gmbh & co. kg as its legal form, the matching returned countries and types only for gmbh, which is a different legal form. There is now a mechanism that handles this.
3) There were tests for basename cleaning but none for matching legal forms, so I added at least basic tests.
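
To illustrate the disambiguation described in point 2, here is a minimal sketch of longest-suffix-first matching. It is not the code from this PR or cleanco's actual API: the `LEGAL_FORMS` table, its type and country values, and the `normalize` and `match_legal_form` helpers are all hypothetical names made up for the example.

```python
# Hypothetical lookup table: legal-form suffix -> (business type, countries).
# The entries are illustrative only and are not cleanco's real data.
LEGAL_FORMS = {
    "gmbh": ("Limited", ["Germany", "Austria", "Switzerland"]),
    "kg": ("Limited Partnership", ["Germany"]),
    "gmbh & co. kg": ("Limited Partnership", ["Germany"]),
    "ltd": ("Limited", ["United Kingdom"]),
    "plc": ("Public Limited Company", ["United Kingdom"]),
}


def normalize(name):
    """Lowercase the name, drop commas, and collapse runs of whitespace."""
    return " ".join(name.lower().replace(",", " ").split())


def match_legal_form(name):
    """Return (suffix, (type, countries)) for the longest matching suffix.

    Trying longer suffixes first means a multi-token form such as
    "gmbh & co. kg" wins over the shorter "kg" that it also ends with.
    Returns None when no known legal form terminates the name.
    """
    normalized = normalize(name)
    for suffix in sorted(LEGAL_FORMS, key=len, reverse=True):
        # Require a token boundary so "plc" does not match inside a word.
        if normalized == suffix or normalized.endswith(" " + suffix):
            return suffix, LEGAL_FORMS[suffix]
    return None


if __name__ == "__main__":
    print(match_legal_form("Example GmbH & Co. KG"))
    print(match_legal_form("Example GmbH"))
```

Ordering candidates by length is one simple way to prefer the most specific legal form; the mechanism in the PR itself may work differently.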