This package implements both the discrete and continuous maximum likelihood estimators for fitting the power-law distribution to data. Additionally, a goodness-of-fit based approach is used to estimate the lower cutoff for the scaling region.
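For orientation, here is a minimal sketch of that workflow using the moby word-frequency data set that ships with the package; the argument values are only illustrative, not a recommendation:

library("poweRlaw")
data("moby", package = "poweRlaw")   # word frequencies from Moby Dick, bundled with the package

m_pl = displ$new(moby)               # discrete power-law model
est = estimate_xmin(m_pl)            # goodness-of-fit based estimate of the lower cut-off
m_pl$setXmin(est)

bs = bootstrap_p(m_pl, no_of_sims = 100, threads = 2, seed = 1)
bs$p                                 # bootstrapped goodness-of-fit p-value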
bootstrap_p with lognormal dist. returns checkForRemoteErrors(val), unexpectedly #78
Dear Mr. Gillespie
I have an offline desktop at the statistical agency and use poweRlaw on a private data set (NA values have already been removed). My work environment:
Windows 7, i5, 4 GB RAM, R 3.5.1, RStudio 1.1.456, poweRlaw 0.70.1, VGAM 1.0-6, Rtools 3.5
(all newly downloaded and installed from file)
I run the package on 12 years of data, following the documentation. compare_distributions works fine in all cases. Using bootstrap_p with the lognormal distribution, I get a p-value for some years, but the 2013, 2012, 2008 and 2005 data return the following error with varying Gb values:
library("poweRlaw")
m_ln = dislnorm$new(v2013)
est = estimate_xmin(m_ln)
m_ln$setXmin(est)
bs_p = bootstrap_p(m_ln, no_of_sims = 1, xmins = seq(140,160,2), threads = 4, seed = 1)
---time estimation message as usual here---
Error in checkForRemoteErrors(val) : four nodes produced an error: cannot allocate vector of size 405.6 Gb
This incredible vector size goes up to hundreds of thousands of Gb depending on the inputs I provide to bootstrap_p, such as no_of_sims or xmins. This issue was raised here 3 years ago but was never solved.
I have checked:
- All my data are separate vectors and I call them separately (no loops added by me).
- Object sizes for the data vectors are in the hundreds of KB and similar to each other.
- It is strange that only a few of them, and only for the lognormal distribution, return the error. All power-law p-values are calculated.
- I have found a suggestion here under the Memory Load section but could not make use of it.
I'm sorry that I cannot provide data for reproduction due to regulations. I hope this still helps to get things working.
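For illustration only: since the real vectors cannot be shared, a synthetic stand-in like the one below mirrors the call above; the sample size and the meanlog/sdlog values are arbitrary. If the same error appears with synthetic data, it would confirm the problem is not specific to the private data set.

library("poweRlaw")
set.seed(1)
v_sim = ceiling(rlnorm(5000, meanlog = 4, sdlog = 1.5))   # hypothetical heavy-tailed integer sample

m_ln = dislnorm$new(v_sim)       # discrete lognormal model
est = estimate_xmin(m_ln)
m_ln$setXmin(est)

# Same call pattern as in the report above
bs_p = bootstrap_p(m_ln, no_of_sims = 1, xmins = seq(140, 160, 2), threads = 4, seed = 1)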
Thanks for this super package and your attention.