tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

`geom_smooth()` fails for every group if only one group is wrong #5352

Closed: eliocamp closed this issue 11 months ago

eliocamp commented 1 year ago

It seems that geom_smooth() draws no smooth lines at all if the computation fails for even one of the groups.

This doesn't work:

library(ggplot2)
packageVersion("ggplot2")
#> [1] '3.4.2.9000'

diamonds |> 
  subset(cut == "Ideal" & color == "J") |>
  ggplot(aes(carat, price)) +
  geom_point(aes(color = clarity)) +
  geom_smooth(aes(color = clarity))
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : span too small.  fewer data values than degrees of freedom.
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : at 0.9598
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : radius 0.00010404
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : all data on boundary of neighborhood. make span bigger
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : pseudoinverse used at 0.9598
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : neighborhood radius 0.0102
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : reciprocal condition number 1
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : at 3.0202
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : radius 0.00010404
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : all data on boundary of neighborhood. make span bigger
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : There are other near singularities as well. 0.00010404
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : zero-width neighborhood. make span bigger

#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
#> : zero-width neighborhood. make span bigger
#> Warning: Computation failed in `stat_smooth()`
#> Caused by error in `predLoess()`:
#> ! NA/NaN/Inf in foreign function call (arg 5)

The problem is that the "I1" clarity group has only two observations, so geom_smooth() can't fit a smooth to it. If I remove that group, the plot works:

diamonds |> 
  subset(cut == "Ideal" & color == "J" & clarity != "I1") |> 
  ggplot(aes(carat, price)) +
  geom_point(aes(color = clarity)) +
  geom_smooth(aes(color = clarity))
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
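
The manual filter above can be generalised so that small groups are dropped without naming them. The following is a minimal base-R sketch; `keep_large_groups()` is a hypothetical helper (not part of ggplot2), the threshold `min_n = 5` is an arbitrary assumption, and the toy data only mimic the shape of the example.

```r
# Hypothetical helper (not a ggplot2 function): keep only the rows whose
# group has at least `min_n` observations. The default of 5 is an
# arbitrary assumption, not a documented minimum for loess.
keep_large_groups <- function(data, group_col, min_n = 5) {
  sizes <- table(data[[group_col]])          # observations per group
  big <- names(sizes)[sizes >= min_n]        # groups that are large enough
  data[as.character(data[[group_col]]) %in% big, , drop = FALSE]
}

# Toy data: two reasonably sized groups and one group with two rows.
toy <- data.frame(
  clarity = c(rep("SI1", 10), rep("VS2", 8), rep("I1", 2)),
  carat   = seq(0.3, 2.2, length.out = 20)
)

filtered <- keep_large_groups(toy, "clarity")
table(filtered$clarity)
```

Since a layer's `data` argument accepts a function of the plot data, a helper like this could presumably be passed as `geom_smooth(aes(color = clarity), data = function(d) keep_large_groups(d, "clarity"))`, leaving the points layer untouched.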

This is surprising: I would have expected each group to be fitted independently, so that if one group fails to fit, the rest still work.

Created on 2023-07-17 with reprex v2.0.2

teunbrand commented 1 year ago

I agree that this is undesired behaviour; perhaps the fit should be wrapped in tryCatch() inside the StatSmooth$compute_group() method.
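
The tryCatch() idea can be illustrated outside ggplot2 with a rough base-R sketch. It fits each group separately and turns an error in one group into a warning, so the remaining groups still produce a result. `smooth_by_group()` and the toy data are hypothetical and for illustration only; this is not ggplot2's code and not necessarily how the issue was resolved.

```r
# Hypothetical helper, not part of ggplot2: fit a loess curve per group and
# downgrade a failure in one group to a warning instead of aborting all groups.
smooth_by_group <- function(data, group_col, n = 20) {
  groups <- split(data, data[[group_col]], drop = TRUE)
  pieces <- lapply(names(groups), function(g) {
    d <- groups[[g]]
    tryCatch(
      {
        fit <- loess(y ~ x, data = d)
        grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = n))
        grid$y <- predict(fit, newdata = grid)
        grid[[group_col]] <- g
        grid
      },
      error = function(e) {
        # Report which group failed, then drop only that group.
        warning("Smoothing failed for group '", g, "': ",
                conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  # Discard failed groups and stack the rest.
  do.call(rbind, Filter(Negate(is.null), pieces))
}

# Toy data (an assumption): two well-populated groups and one with two points.
# Whether the two-point group raises an error or only warnings may depend on
# the R version; either way the other groups are unaffected.
set.seed(1)
toy <- data.frame(
  g = c(rep("a", 30), rep("b", 30), rep("tiny", 2)),
  x = c(seq(0, 3, length.out = 30), seq(0, 3, length.out = 30), 1, 2)
)
toy$y <- sin(toy$x) + rnorm(nrow(toy), sd = 0.1)

smoothed <- smooth_by_group(toy, "g")
```

Only errors are caught here; warnings from loess() itself are left to surface as usual, so the user can still see that a group was problematic.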