Closed ataustin closed 6 years ago
I have this same error.
Here is the relevant code block:
else if (type == "numeric") {
  # n0 <- length(which(x == 0))
  # log <- FALSE
  skw <- DistributionUtils::skewness(x, na.rm = TRUE)
  hst <- graphics::hist(x, plot = FALSE)
  res <- list(
    type = type,
    dist = list(
      raw = list(
        breaks = hst$breaks,
        freq = hst$counts
      )
    )
  )
  res$log_default <- FALSE
  if (!is.nan(skw) && skw > 1.5 && all(x >= 0, na.rm = TRUE)) {
    # log <- TRUE
    x <- x[x > 0]
    x2 <- log10(x)
    rng <- range(x2, na.rm = TRUE)
    brks <- 10 ^ seq(rng[1], rng[2], length = grDevices::nclass.Sturges(x))
    lhst <- hist(x, breaks = brks, plot = FALSE)
    res$dist$log <- list(
      breaks = lhst$breaks,
      freq = lhst$counts
    )
    res$log_default <- TRUE
  }
}
In this case, there is only 1 non-zero entry.
Skewness is therefore > 1.5, but when x <- x[x > 0] is evaluated, it returns a single number from which to produce breaks for the histogram, which throws an error since you need more than one.
One fix could be to test the length of x[x > 0], and have a separate case if it is 1.
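To make the failure concrete, here is a minimal sketch using made-up data (the vector x is hypothetical, and the skewness check is left out so that it needs only base R). It follows the same steps as the log branch above and captures the error message instead of stopping:

```r
# Hypothetical data: many zeros and a single positive value below 1
x <- c(rep(0, 20), 0.4)

# Same steps as the log branch in the code block above
x_pos <- x[x > 0]                       # only one value survives
rng <- range(log10(x_pos), na.rm = TRUE)
brks <- 10 ^ seq(rng[1], rng[2], length = grDevices::nclass.Sturges(x_pos))

length(brks)  # 1: there is only a single "break"

# Capture the error message rather than aborting the script
res <- tryCatch(
  graphics::hist(x_pos, breaks = brks, plot = FALSE),
  error = function(e) conditionMessage(e)
)
res
```

With a single positive value, both ends of rng are equal and nclass.Sturges() returns 1, so brks has one element and res holds the error message from hist().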
hist() is a little weird in that it will accept a length-1 vector and interpret it as the number of cells in the histogram, rather than the break locations. For length-1 breaks where the value of breaks is 1 or larger, it rounds down to the nearest integer and uses that as the number of cells. But for breaks < 1 it gives the error.
> x <- 1:10
> hist(x, breaks = 1, plot = FALSE)$breaks
## [1] 0 10
>
> hist(x, breaks = 2.2, plot = FALSE)$breaks
## [1] 0 5 10
>
> hist(x, breaks = 0.4, plot = FALSE)$breaks
## Error in hist.default(x, breaks = 0.4, plot = FALSE) :
## invalid number of 'breaks'
>
> hist(x, breaks = -2, plot = FALSE)$breaks
## Error in hist.default(x, breaks = -2, plot = FALSE) :
## invalid number of 'breaks'
Ah yes, you are quite right. I suppose that behaviour makes it a bit odd to program with.
Still, I think the easiest fix would be to just keep whichever histogram came out at the beginning if there is only one value. Maybe there would be some side effects to that, performance-wise or otherwise, but it could be as simple as:
if (!is.nan(skw) && skw > 1.5 && all(x >= 0, na.rm = TRUE) && length(x[x > 0]) > 1) {
Happy to test that or a better fix on my machine later and submit a PR.
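For anyone who wants to try the idea in isolation, here is a rough sketch of that guard as a standalone helper. The function name log_hist_summary is made up for illustration and is not part of trelliscopejs, the skewness check is omitted to keep it base R only, and NA values are dropped explicitly, which is an assumption on my part rather than something taken from the package:

```r
# Hypothetical helper (not part of trelliscopejs): returns the log-scale
# histogram summary, or NULL when fewer than two positive values remain.
log_hist_summary <- function(x) {
  x_pos <- x[!is.na(x) & x > 0]
  if (length(x_pos) <= 1) {
    return(NULL)  # caller keeps the raw histogram computed earlier
  }
  rng <- range(log10(x_pos))
  brks <- 10 ^ seq(rng[1], rng[2],
                   length.out = grDevices::nclass.Sturges(x_pos))
  lhst <- graphics::hist(x_pos, breaks = brks, plot = FALSE)
  list(breaks = lhst$breaks, freq = lhst$counts)
}

# Only one positive value: the log histogram is skipped
log_hist_summary(c(rep(0, 20), 0.4))

# Several distinct positive values: breaks and counts are returned
str(log_hist_summary(c(0, 0, 1, 10, 100, 1000)))
```

Returning NULL leaves it to the caller to fall back on the raw histogram, which matches the "keep whichever histogram came out at the beginning" idea.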
You would be my hero. I haven't had the time to do a PR but the bug is causing us some grief at the office.
Thank you! Yes please do test and submit a PR. Much appreciated. I apologize for being slow on this issue!
@hafen no worries, trelliscopejs is :1st_place_medal:! I'm presenting on it to my colleagues tomorrow. :smiley:
@hafen no worries at all, we are all busy! I went ahead and submitted a PR.
Closed by https://github.com/hafen/trelliscopejs/pull/56 :fireworks:
This is surely an edge case, but one of my derived cognostics produced this error when building a trelliscope display:
Error in hist.default(x, breaks = brks, plot = FALSE) : invalid number of 'breaks'
I traced it back to get_cog_distributions, where in my case breaks was a single-element vector. The problem seems to occur when breaks is between 0 and 1.
Here is a reproducible example:
Here is some system information: