psolin / cleanco

Company Name Processor written in Python
MIT License
316 stars 94 forks

cleanco('AMBA').clean_name() is empty #57

Open dbradshaw opened 3 years ago

petri commented 3 years ago

What should it be, and why?

dbradshaw commented 3 years ago

I don't know. Maybe leave it as AMBA. Not familiar with the company.

petri commented 3 years ago

We cannot fix this if we don't know what the result should be. Would you at least know which country this "AMBA" is in?

dbradshaw commented 3 years ago

No. I came across it using cleanco and it caused a problem which I had to work around.

I don't think I can help you further. Thanks for looking into it.

-David

FBnil commented 1 year ago

This will always be a problem when a company is named after an entry in termdata (in this case, a term from Denmark). Either terms_by_country needs to be limited to the countries the user expects or does business with, or the code needs to change so that when removing the last term leaves an empty result, it returns the first word of the received string as the "most probable name" instead. To avoid breaking backwards compatibility, this would need a new function (or a flag that changes the behavior).

    print(basename('AMBA'))                # ""
    print(basenameorfirst('AMBA'))         # "AMBA"
    print(basename('inc & co'))            # ""
    print(basenameorfirst('inc & co'))     # "inc"

This would allow for company names like 'inc & co', where inc is a term but is also the name.
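A minimal sketch of how the proposed fallback could look, to make the suggestion concrete. It is not cleanco's code. `toy_basename` and `TOY_TERMS` are hypothetical stand-ins for the library's term stripping and its termdata lists, and `basename_or_first` is an assumed name for the `basenameorfirst` idea above. The stripping function is passed in as a parameter, so in real use one would presumably hand it cleanco's own `basename` instead of the toy version.

```python
# Hypothetical stand-in term list; cleanco's real terms live in its termdata module.
TOY_TERMS = {"amba", "inc", "co", "ltd"}


def toy_basename(name):
    """Toy stand-in for term stripping: remove trailing legal terms."""
    words = name.split()
    while words:
        last = words[-1].strip(".,").lower()
        # Drop trailing legal terms and punctuation-only tokens such as "&".
        if last in TOY_TERMS or not any(ch.isalnum() for ch in last):
            words.pop()
        else:
            break
    return " ".join(words)


def basename_or_first(name, strip=toy_basename):
    """Return the stripped name, or the first word if stripping leaves nothing."""
    stripped = strip(name)
    if stripped:
        return stripped
    # Stripping consumed the whole name: fall back to the first word
    # as the "most probable name".
    words = name.split()
    return words[0] if words else ""


print(repr(toy_basename("AMBA")))           # ''
print(repr(basename_or_first("AMBA")))      # 'AMBA'
print(repr(toy_basename("inc & co")))       # ''
print(repr(basename_or_first("inc & co")))  # 'inc'
```

Keeping the fallback in a separate wrapper leaves the existing function's behavior untouched, which is the backwards-compatibility point raised above.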