rychoo2 opened this issue 7 years ago (status: Open)
Perhaps the issue here is that there is no space between the name and the suffix? What countries are these companies based in?
Correct, as long as there is whitespace it is parsed OK. These companies are based in the USA and China, but I believe the key is that the data was probably entered in China, where people are less used to putting whitespace after punctuation. I believe the library could be made robust to that.
I see how this could be an issue, but only because you didn't clean up your data first. Typically there is whitespace and then the entity abbreviation. That is how everyone writes these business name strings. I don't think the script should look for whitespace and/or any non-alphanumeric symbol and then run a lookup; I don't think it is responsible for adding spaces after symbols either.
Edit: Yes, spaces and a trailing comma are removed, only because (again) this is a standard way to write a business name.
I have seen the entity abbreviation being separated by a comma (more often comma + whitespace, actually). Although I'd agree that whitespace (no comma) is a more common separator.
I guess we could replace commas with whitespace as a preprocessing step? I am a little surprised we did not already have this :) In any case, I don't have time to work on that.
As @psolin pointed out, replacing commas with whitespace would probably be an easy data cleanup workaround.
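For reference, the comma workaround discussed above could look roughly like the sketch below. It is only a minimal illustration of cleaning the input before parsing: `normalize_separators` is a hypothetical helper name, not part of the library, and it uses nothing beyond Python's standard `re` module.

```python
import re


def normalize_separators(name: str) -> str:
    """Hypothetical pre-cleaning helper (not part of the library)."""
    # Replace every comma with a space so "LIBGAS,LTD" becomes "LIBGAS LTD".
    spaced = name.replace(",", " ")
    # Collapse runs of whitespace into a single space and trim both ends.
    return re.sub(r"\s+", " ", spaced).strip()


if __name__ == "__main__":
    # Sample names taken from the original report in this thread.
    for raw in ["LIBGAS,LTD", "AIRDAS USA,LLC", "GF LOGISTICS.INC"]:
        print(raw, "->", normalize_separators(raw))
```

The cleaned string would then be passed to the parser as usual. Note that this only covers the comma cases: a name such as GF LOGISTICS.INC, where a period is the separator, comes out unchanged and would need separate handling.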
Hello,
Very nice module, but it doesn't always handle real, human-entered company names well, and we deal with a lot of them. Below are some obvious examples where the name is not parsed:
LIBGAS,LTD -> LIBGAS,LTD
AIRDAS USA,LLC -> AIRDAS USA,LLC
GF LOGISTICS.INC -> GF LOGISTICS.INC
HAKUTATZ.TECH.CO.,LTD. -> HAKUTATZ.TECH.CO.,LTD
Thanks