anspiess / propagate

Propagation of Uncertainty

Error in if (length < 0 || length > .Machine$integer.max) #4

Open khughitt opened 4 years ago

khughitt commented 4 years ago

Greetings!

I'm attempting to use propagate::bigcor() for the first time to compute the correlation matrix for a dataset with 52,232 columns:

propagate::bigcor(dat, use = 'pairwise.complete')

But I run into an issue relating to ff:

Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 0 and .Machine$integer.max") :                 
  missing value where TRUE/FALSE needed                                                                                               
Calls: <Anonymous> -> ff                                                                                                              
In addition: Warning message:                                                                                                         
In ff(vmode = "double", dim = c(NCOL, NCOL)) :                                                                                        
  NAs introduced by coercion to integer range                                                                                         
Calls: <Anonymous> -> ff                                                                                                              
Called from: ff(vmode = "double", dim = c(NCOL, NCOL))   

The system I'm testing this on has 128G memory.

Any ideas what the issue could be?

anspiess commented 4 years ago

Hi Keith,

Hmm, not at the moment, unfortunately... Have you tried increasing the "size" argument to maybe 10000, or omitting "use"?

Cheers, Andrej



khughitt commented 4 years ago

@anspiess

Thanks for the quick response and suggestions. I tried changing both the size and use arguments, but no luck.

It appears that the issue has to do with a limitation in ff.

In ff.R:2465, there is a call:

n <- as.integer(prod(dim))

When dim is too large (in my case, ~4.65e4 or larger), the product of the dimensions exceeds .Machine$integer.max (2147483647), so casting it with as.integer() yields NA:

r$> prod(c(4.65e4, 4.65e4))                                                                                                             
[1] 2162250000

r$> as.integer(prod(c(4.65e4, 4.65e4)))                                                                                                 
[1] NA
Warning message:
NAs introduced by coercion to integer range 
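As a rough illustration of where the limit sits, the sketch below works out the largest square dimension whose cell count still fits in an R integer, and checks the 52,232-column case from the original report. The variable names are made up for this example.

```r
# Largest n for which an n x n matrix still has an integer-addressable length.
max_square_dim <- floor(sqrt(.Machine$integer.max))
max_square_dim                      # 46340

# Column count from the original report (illustrative variable name).
n_cols <- 52232
n_cols^2 > .Machine$integer.max     # TRUE: the cell count overflows as.integer()
```

So any square result with more than 46,340 columns would hit the NA shown above.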

I'll report the issue upstream, but since it should be easy to check for the above, it could be worth adding a quick check to bigcor() as well to warn users. Happy to submit a PR if you think this is something worth doing.
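A minimal sketch of what such a check might look like is below. The helper name check_square_dim() is hypothetical and is not part of propagate or ff; it only assumes that the result is an n x n matrix whose cell count must not exceed .Machine$integer.max.

```r
# Hypothetical helper (not part of propagate): fail early with a readable
# message if an n_cols x n_cols result has more cells than an R integer can hold.
check_square_dim <- function(n_cols) {
  # Square in double precision so the product itself cannot overflow.
  n_cells <- as.numeric(n_cols)^2
  if (n_cells > .Machine$integer.max) {
    stop("a ", n_cols, " x ", n_cols, " matrix needs ",
         format(n_cells, scientific = FALSE), " cells, more than ",
         ".Machine$integer.max (", .Machine$integer.max, ")",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Example: passes silently for a small input, errors for the reported case.
check_square_dim(1000)
res <- try(check_square_dim(52232), silent = TRUE)
inherits(res, "try-error")
```

Calling something like this on ncol(dat) before the ff() allocation would replace the "missing value where TRUE/FALSE needed" error with a message that names the actual cause.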

Cheers, Keith

anspiess commented 4 years ago

Ok, thanks. I'm just wondering whether this could be resolved by using the "bigmemory" package instead of "ff"... But of course the large integer problem persists... Cheers.



alethere commented 1 month ago

I ran into the same issue with a matrix of 89,000 columns. That does not seem that large... It is not very reassuring that this was opened 4 years ago, though.