tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

select_if(<logical vector>) #4213

Closed grabear closed 5 years ago

grabear commented 5 years ago

Hello all! :smiley:

:new_moon: Intro

During a recent Pull Request (https://github.com/vallenderlab/MicrobiomeR/pull/79), I discovered that our Travis CI build was breaking for no obvious reason. After a few hours of testing code, I still couldn't figure it out, so I created a new PR to test the Travis build (see https://github.com/vallenderlab/MicrobiomeR/pull/82 for our process of elimination).

:first_quarter_moon: Troubleshooting Workflow

Via https://github.com/vallenderlab/MicrobiomeR/pull/82 and the same error that @sdhutchins was seeing locally, we discovered that the problem was with dplyr::select_if.

Here are the Travis builds:

- Original branch: https://travis-ci.com/vallenderlab/MicrobiomeR/builds/101668726
- Debugging branch: https://travis-ci.com/vallenderlab/MicrobiomeR/builds/101679286

Here is the code in context: https://github.com/vallenderlab/MicrobiomeR/blob/91eb2c1b9bf124b01cdc662722f484f2665b92cf/R/utils.R#L330-L351

Because the code ran on my local machine but not on @sdhutchins's computer or the Travis servers, we were able to compare our libraries. https://github.com/vallenderlab/MicrobiomeR/pull/82 has the session info for both my and @sdhutchins's local machines. I copied the dependencies from a previously successful Travis build and from the current breaking build and found the following differences:

| package | bad version | bad date | good version | good date |
|---|---|---|---|---|
| ggsignif | 0.5.0 | 2/20/2019 | 0.4.0 | 8/3/2017 |
| microbiome | 1.5.28 | 2/20/2019 | 1.5.27 | 2/12/2019 |
| phyloseq | 1.27.2 | 2/20/2019 | 1.25.2 | 2/12/2019 |
| ggthemes | 4.1.0 | 2/19/2019 | 4.0.1 | 8/24/2018 |
| ellipsis | 0.1.0 | 2/19/2019 | 1 | 1/1/1900 |
| forcats | 0.4.0 | 2/17/2019 | 0.3.0 | 2/19/2018 |
| dplyr | 0.8.0.1 | 2/15/2019 | 0.7.8 | 11/10/2018 |
| rotl | 3.0.7 | 2/15/2019 | 3.0.6 | 1/20/2019 |
| R6 | 2.4.0 | 2/14/2019 | 2.3.0 | 10/4/2018 |
| shades | NA | | 1.3.1 | 2/14/2019 |

It seems that dplyr was updated recently (0.7.8 to 0.8.0.1), and that's what was breaking our code.

:full_moon: Solution

We both started adjusting our DESCRIPTION file and our .travis.yml file. We tried to pin dplyr to version 0.7.8, but the Travis build force-installed the newest version (0.8.0.1) regardless.


So after some digging I changed the travis config file by adding the lines:

install:
- R -e 'devtools::install_deps(dependencies = TRUE, upgrade = "never")'

This seems to fix the builds but only for:

r:
- release
- devel

:last_quarter_moon: Other Issues

The other R version builds:

r:
- 3.3.3
- 3.3.2
- 3.3.1
- 3.3.0

will fail. They are newly added, so I'm assuming it has something to do with the package cache on Travis CI. The immediate error is that devtools is not installed.
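One way the missing-devtools failure might be worked around is to install devtools before calling it. This is only a sketch: it assumes the build has network access to a CRAN mirror (the `repos` URL below is an assumption), and it has not been verified on the R 3.3.x images.

```r
# Install devtools only if the build image does not already provide it.
# The mirror URL is an assumption; any reachable CRAN mirror should do.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools", repos = "https://cloud.r-project.org")
}

# Same call as in the install step above: install the package's dependencies
# without upgrading packages that are already installed.
devtools::install_deps(dependencies = TRUE, upgrade = "never")
```

In .travis.yml this would still have to be run through `R -e` or `Rscript` in the install step.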

Additionally it might have something to do with the recent purrr update as well? I'm not sure.

Any help here would be appreciated. I'm not sure if we're doing something wrong in our code, or if we should change something in our Travis build so that it doesn't force-install dplyr 0.8.0.1. The current fix doesn't seem like a very good long-term solution.

dplyr - @romainfrancois @hadley
purrr/devtools - @lionel @jennybc @jimhester

Sorry to tag so many, but I thought at least one of you would be able to help us. We spent ~7.5 hours yesterday trying to troubleshoot the problem ourselves and got to a good point, but we are still uncomfortable merging our PR: https://github.com/vallenderlab/MicrobiomeR/pull/79

Cheers, @grabear

romainfrancois commented 5 years ago

Thanks for the details. I’ll have a look in the morning.

romainfrancois commented 5 years ago

Here is a stripped down version of the problem:

library(dplyr, warn.conflicts = FALSE)

ids <- "Sepal.Length"
iris %>% 
  select_if(!names(.) %in% ids)
#> Error: Can't create call to non-callable object

This is not how select_if() works; if it used to work with 0.7.8, it was purely accidental. Here are some alternatives:

library(dplyr, warn.conflicts = FALSE)

iris <- head(iris, 3L)
ids <- "Sepal.Length"
iris %>% 
  select(-ids)
#>   Sepal.Width Petal.Length Petal.Width Species
#> 1         3.5          1.4         0.2  setosa
#> 2         3.0          1.4         0.2  setosa
#> 3         3.2          1.3         0.2  setosa

iris %>% 
  select_at(setdiff(names(.), ids))
#>   Sepal.Width Petal.Length Petal.Width Species
#> 1         3.5          1.4         0.2  setosa
#> 2         3.0          1.4         0.2  setosa
#> 3         3.2          1.3         0.2  setosa
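
If the goal is a selection that cannot change between dplyr releases, the same columns can also be dropped with plain base R subsetting. This is a minimal sketch and not part of the suggestions above; `df`, `keep`, `by_logical` and `by_name` are names chosen here for illustration.

```r
# Base R only: no dplyr needed, so the result does not depend on the dplyr version.
df <- head(iris, 3L)
ids <- "Sepal.Length"

# Keep every column whose name is not in `ids`, using a logical vector.
keep <- !names(df) %in% ids
by_logical <- df[, keep, drop = FALSE]

# The same selection by name, mirroring select_at(setdiff(names(.), ids)).
by_name <- df[, setdiff(names(df), ids), drop = FALSE]

# Both leave the four columns other than Sepal.Length.
names(by_logical)
names(by_name)
```

`drop = FALSE` keeps the result a data frame even when only one column remains.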

lionel- commented 5 years ago

Actually, it was on purpose that you could supply logical vectors instead of a predicate. This follows purrr semantics for map_if().
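
To illustrate the "predicate or logical vector" semantics described here, a minimal base R sketch follows. `resolve_predicate()` and `select_if_sketch()` are hypothetical helpers written for this illustration. They are not dplyr or purrr code, and the real implementations differ.

```r
# Hypothetical helper (not dplyr/purrr code): turn `p` into one logical per column.
resolve_predicate <- function(df, p) {
  if (is.logical(p)) {
    # A logical vector is used as-is; it must have one entry per column.
    stopifnot(length(p) == ncol(df))
    return(p)
  }
  # Otherwise treat `p` as a predicate function and apply it to each column.
  vapply(df, function(col) isTRUE(p(col)), logical(1))
}

# Hypothetical stand-in for select_if(): keep the columns flagged TRUE.
select_if_sketch <- function(df, p) {
  df[, resolve_predicate(df, p), drop = FALSE]
}

ids <- "Sepal.Length"

# Logical vector: drops Sepal.Length, as in the original code.
names(select_if_sketch(iris, !names(iris) %in% ids))

# Predicate function: keeps only the numeric columns.
names(select_if_sketch(iris, is.numeric))
```

Under this reading, `!names(.) %in% ids` is simply the logical-vector branch.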

grabear commented 5 years ago

Thanks for looking into this. Any ideas on the Travis CI bit? Or should I ask this somewhere else?

grabear commented 5 years ago

Here is a thread to address any Travis CI related fixes: https://travis-ci.community/t/travis-build-ignoring-r-package-version-in-description/2431

lock[bot] commented 5 years ago

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/