ryscott5 / eparTextTools

BSD 3-Clause "New" or "Revised" License

eparTextTools

This toolkit provides a set of resources for analyzing textual documents in the R programming language, intended for portfolio analysis and review. The tools build on text mining, natural language processing, and machine learning packages developed by other R users and rely heavily on their code. The toolkit is therefore best thought of as a set of tools that enable portfolio analysis rather than as a new package for text analysis.

![Model of Program](https://www.lucidchart.com/publicSegments/view/ecbf4945-8913-479b-ab7d-0f44e5553d30/image.png =300x200)

The tools work towards two broad goals. First, they provide a flexible framework for describing and classifying the content of textual documents. This includes analyzing word frequencies, describing common words, testing for correlations between words, and categorizing strings of text into modeled or human-coded topic categories. As designed, the text tools support query-based description, such as "How often does EPAR research involve the words policy analysis versus program evaluation?" Second, they allow a user to explore documents by letting the documents themselves suggest word correlations, commonalities, and topics.
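As a rough illustration of the query-based description mentioned above, the following base R sketch counts word frequencies and compares how many documents mention two phrases. It does not use this toolkit's own functions: the `docs` vector, the `tokenize` helper, and the `count_phrase` helper are hypothetical names invented for this example, and the sample sentences are made up rather than drawn from EPAR research.

```r
# Hypothetical example documents (assumption: not real EPAR text).
docs <- c(
  "This policy analysis reviews program evaluation methods.",
  "Program evaluation of an agricultural extension project.",
  "A policy analysis of input subsidy programs."
)

# Hypothetical helper: lowercase, replace non-letters with spaces,
# split on runs of spaces, and drop empty tokens.
tokenize <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z ]", " ", x)
  words <- strsplit(x, " +")
  lapply(words, function(w) w[nzchar(w)])
}

tokens <- tokenize(docs)

# Word frequencies across all documents, most frequent first.
word_freq <- sort(table(unlist(tokens)), decreasing = TRUE)

# Hypothetical helper: number of documents containing a literal phrase.
count_phrase <- function(phrase, x) {
  sum(grepl(phrase, tolower(x), fixed = TRUE))
}

phrase_counts <- c(
  policy_analysis    = count_phrase("policy analysis", docs),
  program_evaluation = count_phrase("program evaluation", docs)
)

print(head(word_freq))
print(phrase_counts)
```

On a real portfolio, `docs` would hold the text extracted from each document, and a dedicated text mining package would usually replace the hand-written tokenizer.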

The vignettes below provide descriptions of common tasks.

Vignettes

Intro to Portfolio Review

Google Directions

Extracting Information from Web Forms