lemmar uses tokenization and dictionary lookup to lemmatize text. Lemmatization is defined as
"grouping together the inflected forms of a word so they can be analysed as a single item" (Wikipedia).
While dictionary lookup of tokens is not a true morphological analysis, this style of lemma replacement is fast and typically robust enough for many applications.
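The idea of tokenizing and then looking tokens up in a dictionary can be sketched in base R. This is only an illustration of the general technique, not lemmar's own API: the `lemma_dict` vector and the `lemmatize_tokens` helper are hypothetical names made up for this example, and the toy dictionary stands in for the much larger ones shipped with the package.

```r
# Hypothetical toy dictionary (an assumption for illustration):
# names are inflected tokens, values are their lemmas.
lemma_dict <- c(
  ran = "run", running = "run", runs = "run",
  mice = "mouse", was = "be"
)

# Illustrative helper, not a lemmar function.
lemmatize_tokens <- function(text, dict) {
  # Tokenize: lowercase, then split on runs of non-letter characters.
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]
  # Dictionary lookup: tokens missing from the dictionary come back as NA.
  lemmas <- unname(dict[tokens])
  # Keep the original token whenever no lemma was found.
  ifelse(is.na(lemmas), tokens, lemmas)
}

lemmatize_tokens("The mice ran while the cat was running.", lemma_dict)
```

Because each token costs a single named-vector lookup, the approach stays fast, but any form missing from the dictionary is passed through unchanged, which is the trade-off against a true morphological analysis.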
The dictionaries provided by this package come from http://www.lexiconista.com, and are made freely available under the ODbL 1.0 license (https://opendatacommons.org/licenses/odbl/summary). Please respect this license in reusing the data within the lemmar package including:
To download the development version of lemmar:
Download the zip ball or tar ball, decompress and run R CMD INSTALL on it, or use the pacman package to install the development version:
if (!require("pacman")) install.packages("pacman")
pacman::p_load_current_gh("trinker/lemmar")
You are welcome to: