Closed koheiw closed 6 years ago
:exclamation: No coverage uploaded for pull request base (
master@5dc1df7
). Click here to learn what that means. The diff coverage is94.31%
.
@@ Coverage Diff @@
## master #120 +/- ##
=========================================
Coverage ? 85.22%
=========================================
Files ? 8
Lines ? 528
Branches ? 0
=========================================
Hits ? 450
Misses ? 78
Partials ? 0
Impacted Files | Coverage Δ | |
---|---|---|
R/readtext.R | 85.14% <100%> (ø) |
|
R/download.R | 100% <100%> (ø) |
|
R/utils.R | 94.82% <94.18%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 5dc1df7...9d2fcd1. Read the comment docs.
Let's merge this and update CRAN.
My main question is: Do we need this as a separate function? The nice thing about readtext is that is has a single function that works in a huge range of cases. Direct download from a remote is supposed to be one of those cases. With the news cache
argument, repeated downloads do not need to be re-fetched from the remote. But do we then need a new exposed function (download()
) for this? How about making this just work directly through readtext()
?
Also the increasing complexity of the variety of ways that readtext()
can be used points to the need for a vignette (https://github.com/quanteda/quanteda/issues/996 but really should be an issue for readtext and be added to a pkgdown site as per #61, #65).
Does this fix solve any other open issues, e.g. #119 or #105?
I initially made it only accessible from readtext()
but I noticed that I want to download corpus objects. Since readtext()
always returns data.frame, this is inconsistent. This is why I separated it as download()
.
I commented on #119 and #105 too.
Is the available already in readtext and/or quanteda? Do I have update both packages?
Adds
download()
with cache functionality. Unlikequanteda.corpora::download()
, it only returns path the cache file, so that it can be used for all sorts of files.