ropensci / RNeXML

Implementing semantically rich NeXML I/O in R
https://docs.ropensci.org/RNeXML
Other
13 stars 9 forks source link

Compatibility with dplyr 1.1.0 #254

Closed lionel- closed 1 year ago

lionel- commented 1 year ago

Hello, I see this failure in our revdep tests:

✖ | 1 1    87 | extract_metadata [1.1s]
──────────────────────────────────────────────────────────────────────────────
Failure (test_meta_extract.R:204): ID assignments are correct and complete when meta are nested
sort(meta.cont[, "Meta"]) not equal to sort(unique(meta.nested[, "meta"])).
names for target but not for current

I tried to investigate but the result change seems to happen in get_level() which uses a recursive routine that I found hard to debug. Could you please take a look? See the list of behaviour and breaking changes at https://dplyr.tidyverse.org/dev/news/index.html.

We're planning a release on January 27. A pre-emptive patch release of RNeXML would be helpful, if possible. Unless you think it's a problem in dplyr, in which case please let me know. Thanks!

hlapp commented 1 year ago

Thanks for reporting @lionel-. There isn't anything obviously connected in the behavior changes list. Can you say how to reproduce the problem? (I.e., which version of dplyr to install from where, using which version of R)

hlapp commented 1 year ago

Upon some inspection the test fails not because the elements of the vectors being compared are suddenly different, but because one suddenly has names. At present neither of the two has names, so the test succeeds.

There are several possibilities for this.

One is that the data frame returned by get_metadata() now has row names: https://github.com/ropensci/RNeXML/blob/9184bf9a96f8a54de881098d3e07d496275e3603/tests/testthat/test_meta_extract.R#L168 For this to be the causative change, dplyr::filter() would have to strip the row names: https://github.com/ropensci/RNeXML/blob/9184bf9a96f8a54de881098d3e07d496275e3603/tests/testthat/test_meta_extract.R#L196-L198 I'm not intimately enough familiar with dplyr, but this would seem quite unlikely to be its normal behavior. (Aside from the fact that I'm not sure what the rownames would be coming out of get_metadata().

The only other plausible possibility that one of the two vectors in the comparison would not have lost its column name: https://github.com/ropensci/RNeXML/blob/9184bf9a96f8a54de881098d3e07d496275e3603/tests/testthat/test_meta_extract.R#L204-L205

So it's a bit of a conundrum and likely worth finding out exactly from where names are coming but are then lost from one but not the other vector. For this specific test, names are irrelevant, so we can just change to testing element-wise equality.

cboettig commented 1 year ago

closed by #255

cboettig commented 1 year ago

(ps I don't think dplyr strips rownames, but I think sub-setting with df[, "colname"] does strip rownames. no idea why the dplyr update would break the original test, but oh well. Thanks for the patch, I'll push to CRAN soon)

hlapp commented 1 year ago

sub-setting with df[, "colname"] does strip rownames

And column name (if, as here, it's only one column). But the failure comes from one, and only one, of the two (the latter, if I understand "target" correctly) having names, even though both are obtained the same way (and the way you say).

lionel- commented 1 year ago

Thanks! I was also wondering what changed in dplyr to cause this, but I couldn't come up with a sufficiently small reprex. filter() does still preserve row names in the dev version.