Closed HelenaLC closed 1 year ago
- revised tests (including plotting) bringing us to ~90% coverage
- tests are a little more complex/data-driven now (e.g., no hard coded numbers, in case we change the example data)
- also fixed a potential bug in
rename()
(renaming special columns, e.g.,.cell
, previously gave an error that the column doesn't exists, but should work as intended now)- also updated the example data (
pbmc_small
) viaBiocGenerics::updateObject()
) as it was outdated, causing some tests to fail (due to differences in internal metadata)
Thanks @HelenaLC ! That's great. I left 3 comments with some little changes or explanations.
Love the PR.
Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe?
I am trying to get rid of dplyr pipe wherever I can.
Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of %>%
.
(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)
Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe? I am trying to get rid of dplyr pipe wherever I can.
Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of
%>%
.
Base pipes after R evaluation they disappear. It is just a visual-coding aid. So makes no difference for R.
(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)
Yes thta's a more complex topic. We could discuss about that. It is good to respect standards, but we are practically contributing to changing them. And it is becoming evident to many. When I think about standards, I try to imagine what they will be in 5-10 years. Things are moving fast :)
Personal opinion: I think pipes make the code more readable in virtually all cases.
Also, sorry to be a pain. Since we are changing the test already, could I please ask to use base-R pipe? I am trying to get rid of dplyr pipe wherever I can.
Yes, absolutely! I actually prefer not using any pipes internally, but didn't want to fiddle too much with the style. But agreed, I'll use base R pipes instead of
%>%
.Base pipes after R evaluation they disappear. It is just a visual-coding aid. So makes no difference for R.
(Side note: perhaps we could have a brief chat re linting. I am mainly focusing on Bioc-guidelines plus some simplifications, and I am leaning towards more base R vs. tidy-style, which I think would make things cleaner (and briefer) in many (not all) cases, at least internally...)
Yes thta's a more complex topic. We could discuss about that. It is good to respect standards, but we are practically contributing to changing them. And it is becoming evident to many. When I think about standards, I try to imagine what they will be in 5-10 years. Things are moving fast :)
Personal opinion: I think pipes make the code more readable in virtually all cases.
Absolutely. I did not mean to speak against pipes in general (illusional or not). Currently, I am trying my best to keep them for workflow-like operations, where they do make things easier to write and read. And only not use them when it really is a lot easier to follow (in my opinion), e.g., I struggled more to understand things like ... |> colnames() |> duplicated() |> which() |> length() |> gt(0)
or ... |> duplicated() |> any() |> '!'()
as opposed to !any(duplicated(colnames(...)))
, which literally reads "not any duplicated colnames". All that to say: Yes, I agree that they are easier to read in many cases, but perhaps not all (and also more cumbersome to debug at times...).
Independent of personal opinions / style-preferences, my case against using dplyr/tidyr
inside packages (unless it does make things a LOT easier) is mainly because they are quite volatile at times. I have had to make quite frequent packages updates because of some syntax changes, deprecations etc. throwing warnings. So find it best to avoid them for simple things when possible without much magic.
Aaanyways, hoping to finish a first iteration first (towards a clean-ish BiocCheck
) and we can see from there and change things that are not appreciated. Like I said, I am not against pipes (use them a lot!) and am keeping them in the vast majoirty of cases (like ~90%). Mainly trying to assure consistent use of <-/=
, spacing, line-width, not using @
for accession, etc.
*One example re the above, when()
has been deprecated in favor of base R if (...) ... else(...) ...
:/
is mainly because they are quite volatile at times. I have had to make quite frequent packages updates because of some syntax changes, deprecations etc
true
Just wondering before I finalize the next one, what's happening with this PR? Is there something you'd like to be changed explicitly (and perhaps kept for another PR) before merging?
I see.
So .cell is an immutable column.
A solution can be
- colData() |> as_tibble(rownames = EXISTING_INTERNAL_FUNCTION_THAT_ENCODES_CELL_COLNAME)
- rename
- drop .cell column so it's untouched
- warning ("tidySingleCellExperiment says: ...") If .cell is in the query (there could be multiple columns in the query)
Hello @HelenaLC, is the above still pending? I think this is the last part missing, to be modified fro your PR. All the rest look fantastic.
Alright. I implemented a fix that I think does what it should, i.e., special columns (.cell
, PC
s etc.) cannot be renamed NOR be used as target names; it is a little more complicated than what was there before, but does the trick.
Alright. I implemented a fix that I think does what it should, i.e., special columns (
.cell
,PC
s etc.) cannot be renamed NOR be used as target names; it is a little more complicated than what was there before, but does the trick.
Great, thanks for that! We were not getting the expected error, so I slightly changed the logic.
Also, in the future, we can check expect_error
specifying the expected error message.
Thanks for this PR!
rename()
(renaming special columns, e.g.,.cell
, previously gave an error that the column doesn't exists, but should work as intended now)pbmc_small
) viaBiocGenerics::updateObject()
) as it was outdated, causing some tests to fail (due to differences in internal metadata)