Closed privefl closed 7 years ago
This package, https://bioconductor.org/packages/release/bioc/html/bigmemoryExtras.html, implements a ReferenceClass that knows the backing path and can re-attach as necessary. For example, if you reload the object from an RData file and then use it, it will attach itself to the on-disk data and then carry on. The package also has a factor type and some optimizations related to the dimnames.
Pete
Peter M. Haverty, Ph.D. Genentech, Inc. phaverty@gene.com
On Thu, Dec 22, 2016 at 12:16 AM, Florian Privé notifications@github.com wrote:
Is there a way to know where a filebacked big.matrix is stored on disk (the directory)? If not, I think it should be easy to add one slot to the big.matrix object with its stored backingpath so that we can directly attach or sub a big.matrix without asking the user to specify the directory (backingpath). Or maybe add it to the description object instead?
Do you want to do it? I think I can do it if you want to. If not, I will have to make an object that extends a big.matrix.
I really like the safety feature.
However, I was thinking more of an object with two classes (a class that extends big.matrix, plus big.matrix itself) so that all functions available for big.matrix objects can be used seamlessly. That would let people choose whether or not to use the extension (or it could be included directly as part of big.matrix).
I understand that the big.matrix object is accessed via $bigmat in your BigMatrix. So, if I want a sub.big.matrix, I can use sub.big.matrix(X$bigmat, lastCol = 50, backingpath = dirname(X$backingfile))?
In fact, I just want to be able to use sub.big.matrix without having to specify the backingpath parameter. I am a heavy user of sub.big.matrix for parallelism on column blocks.
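To make the request concrete, here is a minimal sketch of the workaround being discussed: carrying the backing directory alongside the big.matrix so that sub.big.matrix can be called without passing backingpath by hand. The helpers fbm_with_path and sub_cols are hypothetical names invented for illustration and are not part of bigmemory or bigmemoryExtras; the plain list with a bigmat element only mimics the X$bigmat access mentioned above.

```r
library(bigmemory)

# Hypothetical helper (not part of bigmemory): create a file-backed
# big.matrix and remember the directory that holds its backing file.
fbm_with_path <- function(nrow, ncol, backingfile, backingpath = tempdir()) {
  x <- filebacked.big.matrix(
    nrow = nrow, ncol = ncol, type = "double",
    backingfile = backingfile,
    backingpath = backingpath,
    descriptorfile = paste0(backingfile, ".desc")
  )
  list(bigmat = x, backingpath = backingpath)
}

# Hypothetical helper: column-block view that reuses the stored directory.
sub_cols <- function(obj, firstCol, lastCol) {
  sub.big.matrix(obj$bigmat, firstCol = firstCol, lastCol = lastCol,
                 backingpath = obj$backingpath)
}

# Use a unique file name so repeated runs do not collide.
X <- fbm_with_path(10, 100, basename(tempfile(fileext = ".bk")))
X$bigmat[, 1] <- 1:10

# View of the first 50 columns, backed by the same file on disk.
block <- sub_cols(X, firstCol = 1, lastCol = 50)
dim(block)
```

A list is the simplest possible container here; it does not give S4 dispatch on big.matrix methods, which is exactly the limitation that motivates storing the path in the big.matrix object or its description instead.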
The class you describe is exactly what I wanted too. The auto-attach feature required the magic of ReferenceClasses and activeBindingFunctions, though. I hoped to make big.matrix and BigMatrix interchangeable by giving them the same API. However, it would still be nice to have the two share a super class so S4 dispatch would do the right thing.
I'd be open to making a simpler BigMatrix-like thing and putting the shared code in bigmemory, but I'd have to think a bit about what the simpler object would be.
Pete
Peter M. Haverty, Ph.D. Genentech, Inc. phaverty@gene.com
I am analysing some of the code to see if I could change this behaviour without affecting users who rely on the extra path parameter.
I've come across these lines of code: https://github.com/kaneplusplus/bigmemory/blob/master/R/bigmemory.R#L1836-L1840.
As the new address is created with readOnly, I think the condition is always true.
I wanted to be sure before removing these 5 lines of code in the new version I will suggest.
Edit: OK, permissions can be changed with chmod, as in this test: https://github.com/kaneplusplus/bigmemory/blob/master/tests/testthat/test_readonly.R#L63-L66.
So, a second question: should the test be (is.readonly(ret) && !readOnly)?
Could you review this PR: https://github.com/kaneplusplus/bigmemory/pull/56?