This is a tricky one. You can start by creating another column in which each abstract is split into sentences at every period, then use string matching to pull out a subset of frequently occurring pieces of information. Start small, with a handful of variables that appear in most abstracts, then expand from there. Consider stringr for the splitting and matching, and purrr for extending the work to the full dataset.
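Here is a minimal sketch of that approach, not a definitive solution. The `papers` data frame, its `abstract` column, the two regex patterns, and the `extract_first()` helper are all made up for illustration, so swap in your own data and patterns. It splits on whitespace that follows a period rather than on every bare period, so that decimals like "3.5" are not cut in half.

```r
library(dplyr)
library(stringr)
library(purrr)

# Hypothetical example data: one abstract per row.
papers <- tibble(
  id = 1:2,
  abstract = c(
    "We enrolled 120 patients. Mean age was 54 years. Follow-up lasted 12 months.",
    "A total of 85 patients were included. Mean age was 61 years."
  )
)

# Step 1: add a list-column holding the sentences of each abstract.
# The lookbehind keeps the period attached to the sentence it ends.
papers <- papers %>%
  mutate(sentences = str_split(abstract, "(?<=\\.)\\s+"))

# Step 2: start small with a few patterns (assumed here), one capture group each.
patterns <- c(
  n_patients = "(\\d+)\\s+patients",
  mean_age   = "Mean age was (\\d+)"
)

# Hypothetical helper: return the first captured value found in any
# sentence of one abstract, or NA if no sentence matches.
extract_first <- function(sentences, pattern) {
  hits <- str_match(sentences, pattern)[, 2]  # column 2 is the capture group
  hits <- hits[!is.na(hits)]
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# Step 3: use purrr to apply every pattern to every abstract.
extracted <- map(patterns, function(p) {
  map_chr(papers$sentences, extract_first, pattern = p)
})

papers <- bind_cols(papers, as_tibble(extracted))
```

Each new pattern you add to `patterns` becomes a new column, so you can grow the set gradually and check how many abstracts come back as `NA` for each variable before moving on. The extracted values are character strings; convert them with `as.numeric()` once you trust the matches.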