stefan-mueller closed 7 years ago
I just read that the antiword package does not seem to support .docx (yet). We might add this as a comment. It would be great if @kbenoit could check whether my information is correct. If you would like me to add further sections or subsections, please let me know.
antiword doesn't, but our package does. A .docx file is basically XML, and we are able to import it that way.
This adds a folder `vignettes`, which contains a readtext vignette. It is based on `?readtext`, `README.Rmd`, and my own amendments. Please check whether there are better stringi solutions for removing page numbers with a regular expression. Excluding page numbers is a common question, so I came up with two typical examples.
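For discussion, here is a minimal sketch of what the two page-number cases could look like with stringi. The sample text, the variable names, and both patterns are illustrative assumptions rather than the code in the vignette: the first pattern drops lines that contain only a number, the second drops lines of the form "Page 2 of 10".

```r
library(stringi)

# Toy text standing in for an imported document (hypothetical, not from the vignette)
txt <- paste(
  "First paragraph of the report.",
  "1",
  "Second paragraph continues here.",
  "Page 2 of 10",
  "Final paragraph.",
  sep = "\n"
)

# Case 1: a line that holds nothing but a number.
# (?m) lets ^ and $ match at line boundaries; the trailing \n? removes the emptied line.
bare_number <- "(?m)^[ \\t]*\\d+[ \\t]*$\\n?"

# Case 2: a line such as "Page 2" or "Page 2 of 10" (case-insensitive via the i flag).
page_label <- "(?mi)^[ \\t]*page[ \\t]+\\d+([ \\t]+of[ \\t]+\\d+)?[ \\t]*$\\n?"

clean <- stri_replace_all_regex(txt, bare_number, "")
clean <- stri_replace_all_regex(clean, page_label, "")

cat(clean)
```

Note that the first pattern also removes any legitimate line consisting solely of digits, so it is worth checking a few documents by hand before applying it to a whole corpus.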