Open DaBeIDS opened 1 year ago
Dear all,
it would be great to have an option to replace legal terms that appear inside the company name, not only at its end. Maybe not as the default, but as an option. For example:
df_test = pd.DataFrame([[999, 'baupost group llc the']], columns=['ID', 'COMPANY_NAME'])
company_cleaner_obj = CompanyNameCleaner()
company_cleaner_obj.normalize_legal_terms = True
df_clean = company_cleaner_obj.get_clean_df(df_test.copy(), 'COMPANY_NAME', 'COMPANY_NAME_CLEAN')
This does not replace "llc" by default, because the term is followed by "the". Of course, one could first remove the "the" and then replace the legal term, but a general option might be helpful.
If nobody else picks this up, I can also propose a change myself.
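To illustrate the requested behaviour, here is a minimal standalone sketch. It is not the library's implementation: the `LEGAL_TERMS` mapping, the `replace_legal_terms` function and its `anywhere` flag are all hypothetical names invented for this example, and the expansions are assumed values.

```python
import re

# Hypothetical mapping for illustration only; the library's own
# legal-term dictionary is not reproduced here.
LEGAL_TERMS = {
    "llc": "limited liability company",
    "ltd": "limited",
    "inc": "incorporated",
}


def replace_legal_terms(name, terms=LEGAL_TERMS, anywhere=False):
    """Replace legal terms in a lower-cased company name.

    anywhere=False: only a term at the very end of the name is replaced.
    anywhere=True:  a term is replaced wherever it appears as a whole word.
    """
    # Longest keys first so a longer term wins over a shorter one it contains.
    keys = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)
    if anywhere:
        pattern = rf"\b({alternation})\b"
    else:
        pattern = rf"\b({alternation})$"
    return re.sub(pattern, lambda m: terms[m.group(1)], name)


default_result = replace_legal_terms("baupost group llc the")
option_result = replace_legal_terms("baupost group llc the", anywhere=True)
```

With the option off, the name is left unchanged because "llc" is not the last word. With `anywhere=True` it becomes "baupost group limited liability company the". The word boundaries keep the pattern from matching inside longer words.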
Best regards,
David