Open jdhoffa opened 3 years ago
Thanks @georgeharris2deg
I'm not sure if there is an internal reason that we decided to do this, but if it's possible it would be easier for the user to only have to manually validate this output once.
This output would be explained by us picking rows with distinct values of only `id_loan`. We could probably detect the similarity in the other columns. The decision seems to depend on how much of a problem this is and whether it is worth the added complexity in the code.
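For illustration, here is a minimal user-side sketch of what "detecting the similarity in other columns" could look like. It is not how r2dii.match works internally. The `matched` tibble is a hypothetical stand-in for a subset of the `match_name()` output, and the names `to_validate`, `validated` and `restored` are made up for this example. It uses only dplyr and tibble.

```r
library(dplyr)

# Hypothetical stand-in for (a subset of) the output of match_name():
# two rows that are identical except for id_loan.
matched <- tibble::tribble(
  ~id_loan, ~name,                             ~sector, ~name_abcd,                        ~sector_abcd, ~score,
  "L1",     "Alpine Knits India Pvt. Limited", "power", "alpine knits india pvt. limited", "power",      1,
  "L2",     "Alpine Knits India Pvt. Limited", "power", "alpine knits india pvt. limited", "power",      1
)

# Collapse rows that differ only by id_loan, so each distinct match
# has to be validated manually just once.
to_validate <- matched %>%
  select(-id_loan) %>%
  distinct()

# Assumption: manual validation keeps every row unchanged.
validated <- to_validate

# Re-attach every id_loan that shares a validated match.
restored <- matched %>%
  inner_join(validated, by = setdiff(names(matched), "id_loan"))
```

With this approach the user reviews `to_validate` (one row here) instead of one row per loan, and the join restores the per-loan rows afterwards.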
Update: a recent inspection shows that this is still the case:
library(r2dii.match)

lbk <- tibble::tribble(
  ~sector_classification_system, ~id_ultimate_parent, ~name_ultimate_parent, ~id_direct_loantaker, ~name_direct_loantaker, ~sector_classification_direct_loantaker, ~id_loan,
  "NACE", "UP15", "Alpine Knits India Pvt. Limited", "C294", "Yuamen Xinneng Thermal Power Co Ltd", "D35.1", "L1",
  "NACE", "UP15", "Alpine Knits India Pvt. Limited", "C294", "Yuamen Xinneng Thermal Power Co Ltd", "D35.1", "L2"
)

ald <- tibble::tribble(
  ~name_company, ~sector, ~alias_ald,
  "alpine knits india pvt. limited", "power", "alpineknitsindiapvt ltd"
)

match_name(lbk, ald) %>%
  dplyr::select(id_loan, name, sector, name_abcd, sector_abcd, score, level) %>%
  prioritize()
#> # A tibble: 2 × 7
#> id_loan name sector name_abcd sector_abcd score level
#> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
#> 1 L1 Alpine Knits India Pvt. Limi… power alpine k… power 1 ulti…
#> 2 L2 Alpine Knits India Pvt. Limi… power alpine k… power 1 ulti…
Created on 2024-03-26 with reprex v2.1.0
In the reprex below, we see two almost identical loans, with two different values for `id_loan`. The corresponding output of `match_name` will have this repeated as many times as there are different `id_loan` values.
I'm not sure if there is an internal reason that we decided to do this, but if it's possible it would be easier for the user to only have to manually validate this output once.
Created on 2020-12-01 by the reprex package (v0.3.0)
AB#10177