JonathanYe3 / bergeys.webscraper

Split sentences into individual variables #7

Open lwaldron opened 4 years ago

lwaldron commented 4 years ago

This is a tricky one. You can start by creating another column in which each abstract is split at every period, then use string matching to pick out a subset of frequently occurring pieces of information. Start small, with a handful of variables that occur in most abstracts, then expand from there. Consider using stringr for the splitting and matching, and potentially purrr to extend the work to the full dataset.
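A minimal sketch of that approach in R, assuming the scraped data is a data frame with one abstract per row. The column names (`genus`, `abstract`), the example rows, the three patterns, and the helper `extract_variables()` are all hypothetical and only illustrate the idea; the real abstracts will need their own patterns.

```r
library(stringr)
library(purrr)

# Hypothetical stand-in for the scraped data: one abstract per row.
abstracts <- data.frame(
  genus = c("Examplea", "Sampleus"),
  abstract = c(
    "Cells are rod-shaped. Gram-negative. Aerobic. Motile by flagella.",
    "Cells are cocci. Gram-positive. Anaerobic."
  ),
  stringsAsFactors = FALSE
)

# Split each abstract into sentences. Splitting only on a period followed by
# whitespace or end of text (an assumption, slightly stricter than "every
# period") avoids breaking decimals such as "0.5".
split_sentences <- function(text) {
  pieces <- str_trim(str_split(text, "\\.(\\s+|$)")[[1]])
  pieces[pieces != ""]
}
abstracts$sentences <- map(abstracts$abstract, split_sentences)

# Start small: a handful of illustrative patterns, one per target variable.
patterns <- c(
  gram_stain = "(?i)gram[- ](positive|negative|variable)",
  oxygen     = "(?i)\\b(an)?aerobic\\b",
  motility   = "(?i)\\b(non-?)?motile\\b"
)

# For one abstract, return the first sentence matching each pattern, or NA.
extract_variables <- function(sentences) {
  map_chr(patterns, function(p) {
    hits <- sentences[str_detect(sentences, p)]
    if (length(hits) == 0) NA_character_ else hits[[1]]
  })
}

# Apply to every abstract and bind the results into one column per variable.
variables <- as.data.frame(
  do.call(rbind, map(abstracts$sentences, extract_variables)),
  stringsAsFactors = FALSE
)
result <- cbind(abstracts["genus"], variables)
```

Abstracts that lack a sentence for a given variable get `NA` in that column, which makes it easy to count how well each pattern covers the dataset before adding more variables to `patterns`.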