ropensci / skimr

A frictionless, pipeable approach to dealing with summary statistics
https://docs.ropensci.org/skimr
1.11k stars 79 forks source link

Failure with dev testthat #608

Closed hadley closed 4 years ago

hadley commented 4 years ago

Quite a few skimr tests fail on dev testthat:

-- FAILURE (test-dplyr.R:8:3): dplyr::filter works as expected -----------------
Results have changed from known value recorded in 'dplyr/filter-skim.txt'.

new[1:4] vs old[1:4]
+ "-- Data Summary ------------------------"
- "── Data Summary ────────────────────────"

This is probably because dev testthat now sets "cli.unicode" = FALSE to ensure that you get more reproducible output across operating systems. This change appears to be affecting relatively few packages, and I think on the whole it's a useful improvement, so would you mind the updating skimr tests to set for the current version of testthat? I'm happy to help out if needed.

michaelquinn32 commented 4 years ago

Thanks for the heads up Hadley! We'll take a look at it soon.

On Thu, Aug 27, 2020 at 10:29 AM Hadley Wickham notifications@github.com wrote:

Quite a few skimr tests fail on dev testthat:

-- FAILURE (test-dplyr.R:8:3): dplyr::filter works as expected -----------------

Results have changed from known value recorded in 'dplyr/filter-skim.txt'.

new[1:4] vs old[1:4]

  • "-- Data Summary ------------------------"

  • "── Data Summary ────────────────────────"

This is probably because dev testthat now sets https://testthat.r-lib.org/reference/local_test_context.html "cli.unicode" = FALSE to ensure that you get more reproducible output across operating systems. This change appears to be affecting relatively few packages, and I think on the whole it's a useful improvement, so would you mind the updating skimr tests to set for the current version of testthat? I'm happy to help out if needed.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/ropensci/skimr/issues/608, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2QEAPDDAQPPV3H3LV2YOTSC2JWBANCNFSM4QNHQGAQ .

-- Best wishes, Michael

Cell: (917) 808 6508

elinw commented 4 years ago

Interesting that it looks like the rendering of the ---- has changed. Maybe a change in something else like cli.

hadley commented 4 years ago

Sorry for not being clear — this change is because testthat is now deliberately setting options(cli.unicode = FALSE) so that cli never uses unicode characters (because otherwise those tests will fail when run on windows because cli won't use the fancy unicode characters there).

elinw commented 4 years ago

Right. We already skip a lot of tests on Windows for that reason, but maybe this lets us look at that again. Of course we have a ton of utf 8 because of the spark graphs.

elinw commented 4 years ago

@hadley Besides the issue of the lines, I noticed that in tibbles ~ is being used instead of ellipsis to truncate names and ... instead of ellipsis in the metadata for tibble wrapping which in some cases causes the line wrapping to change. Is that final -- I don't want to change a bunch of tests to use ~ if you may change your mind and use another character. Since ellipsis is getting replaced differently in different places it seems not obvious to me. Of course I've only looked at the first 10 failures.

Possibly in some places we should be using local_reproducible _output() to test some specific things.

I guess this may end up with us having to support both versions of testthat if we want to smoothly not break when

hadley commented 4 years ago

None of that has changed; it's just a consequence of setting cli.unicode = FALSE. If you set that yourself now (e.g. with withr::local_options(list(cli.unicode = FALSE)) you'll get code that works reliably across platforms and versions of testthat.

elinw commented 4 years ago

Sorry I guess for me having ellipsis sometimes fall back to one thing (the way I would expect i.e. to ...) and other times to another thing (~) is odd, perhaps that's just me. But it's not the normal thing to have that happen in code.

With a little digging what I found is that cil already handles windows differently for "continue" here ... continue sometimes renders as an ellipsis but it really is not an ellipsis after all.

https://github.com/r-lib/cli/blob/cd4374c736e0c65f54ec56e40d6ba432b6c74b7b/R/symbol.R#L151 https://github.com/r-lib/cli/blob/cd4374c736e0c65f54ec56e40d6ba432b6c74b7b/R/symbol.R#L151 https://github.com/r-lib/cli/blob/cd4374c736e0c65f54ec56e40d6ba432b6c74b7b/R/symbol.R#L46

I didn't dig further but that explains why we are now getting "~" in those specific places --- they are really "continue" and not "ellipsis." Hence we get the below.

+ "  skim_type n_missing complete_rate factor.ordered factor.n_unique factor.top_coun~ numeric.mean"
- "  skim_type n_missing complete_rate factor.ordered factor.n_unique factor.top_coun… numeric.mean"
+ "1 factor            0             1 FALSE                        3 set: 50, ver: 5~        NA   "
- "1 factor            0             1 FALSE                        3 set: 50, ver: 5…        NA   "
+ "# ... with 15 variables: skim_type <chr>, skim_variable <chr>, n_missing <int>,"
- "# … with 15 variables: skim_type <chr>, skim_variable <chr>, n_missing <int>, complete_rate <dbl>,"
+ "#   complete_rate <dbl>, factor.ordered <lgl>, factor.n_unique <int>, factor.top_counts <chr>,"
- "#   factor.ordered <lgl>, factor.n_unique <int>, factor.top_counts <chr>, numeric.mean <dbl>,"

I suspect I won't be the only one puzzled by this, even if it seems super obvious to people deep in the weeds of rlib.

hadley commented 4 years ago

I suspect we have two characters because sometimes (as in variable names) you don’t want to go from a single character to three characters, whereas in other places it’s fine. But I didn’t write this code; I’m just speculating based on what seems reasonable to me.

elinw commented 4 years ago

It makes total sense that cli is handling it in that way for that reason. I was not going to make changes without understanding why the changes were happening the way they were happening. I doubt cli is the only display oriented package that will already have Windows work arounds that will look like they are behaving in non-standard ways with this change.

609

Only horizontal rules that are broken by text are failing currently, probably due to how those fancy rules are constructed.

hadley commented 4 years ago

So far I've tested ~2000 packages, and there have been < 5 failures caused by this change to testthat. Overall, I think it's an improvement going forward (since you tests will automatically give the same output on windows and linux), but I'll probably make this something that you need to explicitly opt-in to by switching to the third edition.

elinw commented 4 years ago

Yeah, we are such a print oriented package with a lot of print tests, that is why. It does seem to be a change that makes sense.

On Aug 29, 2020, at 9:54 AM, Hadley Wickham notifications@github.com wrote:

So far I've tested ~2000 packages, and there have been < 5 failures caused by this change to testthat. Overall, I think it's an improvement going forward (since you tests will automatically give the same output on windows and linux), but I'll probably make this something that you need to explicitly opt-in to by switching to the third edition.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/ropensci/skimr/issues/608#issuecomment-683293309, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFYI7JHUPMK7MSSBRLOOTDSDEB7XANCNFSM4QNHQGAQ.