Basically, the function asks you, species by species, to confirm (1) or reject (any other letter or number) each match, and fills the column resolved_match_type accordingly (it also drops the columns keep, match_edit_distance and match_similarity, as suggested in the tutorial). It uses cli:: for "pretty" headers, which is how I ran into issue #49.
wcvp_manual_fuzzy_check <- function(df, original_name_col = NULL, original_author_col = NULL) {
  # Fall back to the tutorial's column names when none are supplied
  if (is.null(original_name_col)) {
    original_name_col <- "scientific_name"
  }
  if (is.null(original_author_col)) {
    original_author_col <- "authority"
  }
  # Drop the helper columns (as the tutorial suggests) and add the result column
  a <- dplyr::select(df, -keep, -match_edit_distance, -match_similarity)
  a$resolved_match_type <- rep(NA_character_, nrow(a))
  # Two-row table shown at each prompt: original name vs. fuzzy match
  show <- data.frame(Info = c("Original", "Fuzzy"), Species = NA, authors = NA)
  # seq_len() avoids iterating over c(1, 0) when df has no rows
  for (i in seq_len(nrow(df))) {
    show[1, 2] <- df[[original_name_col]][i]
    show[1, 3] <- df[[original_author_col]][i]
    show[2, 2] <- df$wcvp_name[i]
    show[2, 3] <- df$wcvp_authors[i]
    cli::cli_h1("Item {i} of {nrow(df)}")
    print(show)
    # readline() returns a string, so compare against "1"
    manual <- readline("Manual input required: Accept (Enter 1) / Reject (any letter/number): ")
    if (manual == "1") {
      print(paste0("Fuzzy match accepted. ", i, "/", nrow(df)))
    } else {
      print(paste0("Fuzzy match rejected. ", i, "/", nrow(df)))
      a$resolved_match_type[i] <- "Fuzzy match rejected"
    }
  }
  a
}
Happy to send over a better example. Basically, df is the object fuzzy_matches from your tutorial. I use different column names for species in my own data, but if original_name_col and original_author_col are left unset, they default to your standard column names (scientific_name and authority).
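In the meantime, here is a minimal sketch of the same bookkeeping without the prompt, in case it helps to see the expected result. The data frame fuzzy_toy and the helper apply_fuzzy_decisions() are hypothetical names made up for this example (they are not part of rWCVP), and the rows are invented; only the column names follow the function above.

```r
# Toy stand-in for `fuzzy_matches`; all values are invented for illustration.
fuzzy_toy <- data.frame(
  scientific_name = c("Abies albaa", "Quercus robor"),
  authority = c("Mill.", "L."),
  wcvp_name = c("Abies alba", "Quercus robur"),
  wcvp_authors = c("Mill.", "L."),
  keep = NA,
  match_edit_distance = c(1, 1),
  match_similarity = c(0.9, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not in rWCVP): same outcome as the interactive
# function, but the decisions come from a character vector ("1" = accept).
apply_fuzzy_decisions <- function(df, decisions) {
  stopifnot(length(decisions) == nrow(df))
  drop_cols <- c("keep", "match_edit_distance", "match_similarity")
  out <- df[, setdiff(names(df), drop_cols), drop = FALSE]
  # Accepted rows stay NA; everything else is marked as rejected
  out$resolved_match_type <- ifelse(decisions == "1",
                                    NA_character_,
                                    "Fuzzy match rejected")
  out
}

# Accept the first match, reject the second
resolved <- apply_fuzzy_decisions(fuzzy_toy, c("1", "x"))
print(resolved)
```

Passing the decisions as a vector is only there to make the behaviour reproducible in a script; the interactive version collects the same answers one at a time with readline().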
Hi again, I've written a simple function to avoid having to open Excel to decide whether a fuzzy match is valid or not (from the tutorial here: https://matildabrown.github.io/rWCVP/articles/redlist-name-matching.html).