Closed trinker closed 7 years ago
Misplaced the last commit reference to issue https://github.com/trinker/termco/issues/10:
https://github.com/trinker/termco/commit/80a57852df546329d09803e7ae9e9a550e00be8d
summary
function and a source_term_list
to pull term list but not check (used to get individual regexes for testing.
Close (the check collapses and removes dupes. Maybe separate these 2 tasks??):
source_term_list <- function(file, indices = NULL, ...){
## ensure file exists
stopifnot(file.exists(file))
## read in categories file
cats <- source(file)[[1]]
## determine if hierarchical
type <- ifelse(
is.list(cats[[1]]) && length(cats) > 1 && all(sapply(cats, is.list)),
'termco_nested',
'termco_unnested'
)
switch(type,
termco_nested = {
## check for empty tiers
cats <- term_lister_empty_hierarchy_check(cats)
## grab the warnings for later printing
first_pass <- lapply(cats, function(x) {
tryCatch(term_lister_check(x, G = obj), warning=function(w) w)
})
locs <- unlist(lapply(first_pass, inherits, what = 'simpleWarning'))
main <- 'The categories within the following hierarchies (tiers) contained no regular expressions and were dropped:\n\n'
if (any(locs)) {
tiers <- which(locs)
messages <- paste(paste0(
'Tier ', tiers, ": ",
paste0('\n', gsub('^.+?\\n\\n|\\n\\n$', '', unlist(lapply(first_pass[locs], `[[`, 'message')))),
'\n'
), collapse = '\n')
warning(paste0(main, messages))
}
}
)
if (!is.null(indices)) {
warning('`indices` set: dropping `term_list` elements...\n\nUse `indices = NULL` to keep all elements')
cats <- cats[indices]
}
return(cats)
}
Function to read in external data or objects and handle formatting and warnings/errors.
This will make forming a
term_list
easier and can be used inside theterm_count
function (others as well??token_count
) to handle the term_lists checking.