hansenlab / minfi

Devel repository for minfi
58 stars 67 forks source link

Upcoming change to DelayedArray::realize() #256

Closed hpages closed 4 months ago

hpages commented 7 months ago

Hi Kasper,

In the next few days I'm planning to make a small change to the behavior of DelayedArray::realize() that could potentially affect packages that use this function. Looks like minfi is one of them, sorry! :wink:

What will change?

Only the behavior of the default realize() method will change, and it will change only when the BACKEND argument is set to NULL. Right now (i.e. in DelayedArray <= 0.29.0), realize(x, BACKEND=NULL) realizes x in memory (as an ordinary array) and returns the in-memory array wrapped in a DelayedArray object. The change I'm considering is to get rid of the DelayedArray wrap, that is, to return the in-memory array naked.

Note that for any other BACKEND value, nothing will change, i.e. realize() will still be guaranteed to return a DelayedArray derivative (e.g. an HDF5Array object if BACKEND is set to "HDF5Array").

How will this affect your code?

If it's important for your code that realize() always returns a DelayedArray object or derivative, like it does at the moment, then simply replace any call to realize(...) with DelayedArray(realize(...)). This will preserve the current behavior even after the change. Note that calling the DelayedArray() constructor function on a DelayedArray object or derivative is a no-op. In other words, this extra step won't add the DelayedArray wrap again if there's already one.

Where/when will this change happen?

I will make this change to the devel branch of the DelayedArray package in the next few days. It will only (potentially) affect packages in BioC 3.19. Nothing will change in the RELEASE_3_18 branch of DelayedArray (BioC 3.18).

I will try to come up with a PR for minfi if I see that the change breaks it. Just wanted to give you a heads-up.

Cheers, H.

PeteHaitch commented 7 months ago

Thanks for the heads up, Hervé. Can you please explain a bit on the reasons for the change? Also, when BACKEND = NULL, will the in-memory array be an ordinary matrix/array or might it be e.g., a dgCMatrix if that was the in-memory seed of the DelayedArray?

hpages commented 7 months ago

Can you please explain a bit on the reasons for the change?

It's just a better semantic, I think, as there's little benefits in wrapping an ordinary array in a DelayedArray. If that is what the user wants, then they can always add the wrap explicitly. This will be explained in the man page for realize(). Also I'm currently working on some improvements to matrix multiplication between a DelayedMatrix object and an ordinary matrix where the new semantic for realize(x, BACKUP=NULL) will be more useful.

Also, when BACKEND = NULL, will the in-memory array be an ordinary matrix/array or might it be e.g., a dgCMatrix if that was the in-memory seed of the DelayedArray?

Excellent question! I'm also considering returning a SparseArray object instead of an ordinary array if is_sparse(x) is TRUE when BACKEND=NULL. The fact that the seed of x was sparse or not doesn't matter much, since sparsity can easily be destroyed by the delayed operations stacked on x. What really matters is if x itself is still parse or not. But that will be a separate change for later, after I'm done with the change described in this issue.

hpages commented 7 months ago

BTW yesterday I looked at the potential impact of this change on DelayedMatrixStats and I didn't see any (the code in the package always calls realize() with BACKEND="HDF5Array"). Hence why I didn't open an issue for DelayedMatrixStats :wink:

PeteHaitch commented 7 months ago

Cool, thanks for the answers and for taking a look at potential issues in dependencies. I'll take a closer look at my packages once the change is made in DelayedArray but I'll probably aim in the first instance with just keeping the package building and preserving current behaviour until I have time to think about the consequences.

hpages commented 5 months ago

@PeteHaitch I just made the change today. It's in DelayedArray 0.29.1. I'll monitor the Bioconductor daily builds this week and make sure to identify any glitch related to this change.

hpages commented 4 months ago

Looks like this change didn't cause any problem: https://bioconductor.org/checkResults/3.19/bioc-LATEST/minfi/

Closing this now.

PeteHaitch commented 4 months ago

Thanks, Hervé.