ropensci / tokenizers

Fast, Consistent Tokenization of Natural Language Text
https://docs.ropensci.org/tokenizers

Add description of TIF formatted data.frames as inputs #64

Closed: kbenoit closed this 6 years ago

kbenoit commented 6 years ago

The README states that only character vectors and lists of character vectors are valid inputs, but it does not mention the TIF data.frame format.

codecov-io commented 6 years ago

Codecov Report

Merging #64 into master will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master      #64   +/-   ##
=======================================
  Coverage   99.32%   99.32%           
=======================================
  Files          12       12           
  Lines         443      443           
=======================================
  Hits          440      440           
  Misses          3        3

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update cac930d...d4f7966.

lmullen commented 6 years ago

Thanks, @kbenoit. That does belong in the README. There is also a new vignette that describes compliance with TIF. https://github.com/ropensci/tokenizers/blob/master/vignettes/tif-and-tokenizers.Rmd