Closed gshotwell closed 7 years ago
Merging #1 into master will increase coverage by
2.32%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #1 +/- ##
========================================
+ Coverage 97.67% 100% +2.32%
========================================
Files 6 7 +1
Lines 43 68 +25
========================================
+ Hits 42 68 +26
+ Misses 1 0 -1
Impacted Files | Coverage Δ | |
---|---|---|
R/group-by.R | 100% <100%> (+8.33%) |
:arrow_up: |
R/collect.R | 100% <100%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update cacf7b9...8a0eb53. Read the comment docs.
LGTM. Did you say that there was something you forgot to add?
Aside from that, please
Authors@R
in DESCRIPTION### crplyr 0.1.5 (under development)
section to NEWS.md and add a bullet for thisand then merge.
Great, it'll need another look after the grouping stuff is added. Basically if someone does the following:
ds %>%
group_by(cyl) %>%
collect()
They should probably get a grouped tibble rather than an ungrouped one. This is important if we want to do something like automatically apply unresolved functions after the data is collected, and also so that something like this produces a group-wise summary:
ds %>%
group_by(cyl) %>%
collect() %>%
summarize(n = n())
@nealrichardson This should be ready for review. Changes are to better accommodate array variables.
👍 Once https://github.com/Crunch-io/rcrunch/pull/110 goes we should simplify collect
to use as.data.frame
and keep the implementation of that in crunch
.
New method to collect a CrunchDataset from the server. One gap which has been identified is that this method won't respect dataset order. Currently when a vector is collected from the server with
as.vector(v[3:1])
the API will returnv[1:3]
instead. This seems like an edge case forcrplyr
because people are more like to reorder a dplyr object usingarrange
than they are to pass a row index to the initial dataset.