FRBCesab / funbiogeo

:package: R package to help with analyses in functional biogeography
https://frbcesab.github.io/funbiogeo/
GNU General Public License v2.0

fb_plot_number_traits_by_species() shows only number of traits up to 100% of species #105

Closed Rekyt closed 7 months ago

Rekyt commented 7 months ago

Bug description

fb_plot_number_traits_by_species() stops displaying results once it reaches 100% of species. For example, if all species have at least 3 known traits, it shows lines only for ≥3 traits and omits the lines for 1 or 2 traits.
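As a rough illustration of the values one would expect to see, the proportions can be computed in base R without the package. This is only a sketch on a made-up toy dataset: the names `traits`, `n_known`, and `prop_at_least` are hypothetical, and this is not necessarily how `funbiogeo` computes them internally.

```r
# Toy dataset (made up for illustration): 4 species, 5 traits,
# every species has at least 3 known traits
traits <- data.frame(
  species = paste0("sp", 1:4),
  tr1 = c(1.2, 0.4, NA, NA),
  tr2 = c(0.3, NA, 1.1, 0.9),
  tr3 = c(5, 6, 7, 8),
  tr4 = c("a", "b", "c", NA),
  tr5 = c(0.5, 1, 2, 3)
)

# Number of known (non-NA) traits per species, ignoring the species column
n_known  <- rowSums(!is.na(traits[, -1, drop = FALSE]))
n_traits <- ncol(traits) - 1

# Proportion of species with at least k known traits, for k = 0..n_traits
prop_at_least <- vapply(
  0:n_traits,
  function(k) mean(n_known >= k),
  numeric(1)
)
names(prop_at_least) <- paste0(">=", 0:n_traits)

prop_at_least
#>  >=0  >=1  >=2  >=3  >=4  >=5
#> 1.00 1.00 1.00 1.00 0.75 0.25
```

In this toy case the categories ≥1 and ≥2 sit at 100% just like ≥3; those are the lines the plot is reported to leave out.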

Reproducible example

library("funbiogeo")

set.seed(20240318)
# fake dataset of 10 traits for 100 species
sp100_tr10 <- matrix(rnorm(1000,0,1),100,10,
                     dimnames = list(paste0("sp",1:100), paste0("tr",1:10)))

# adding some NA: across all traits
sp100_tr10NA <- sp100_tr10
sp100_tr10NA[which(sp100_tr10>-0.2 & sp100_tr10< 0.2)] <- NA

# formatting as required
species_10traitsNA <- data.frame(species=row.names(sp100_tr10), sp100_tr10NA)
summary(species_10traitsNA) # OK NA across all traits
#>    species               tr1                tr2                tr3          
#>  Length:100         Min.   :-2.06491   Min.   :-1.91453   Min.   :-1.97087  
#>  Class :character   1st Qu.:-0.85940   1st Qu.:-0.75505   1st Qu.:-0.84492  
#>  Mode  :character   Median :-0.20620   Median :-0.25510   Median :-0.34265  
#>                     Mean   : 0.04538   Mean   : 0.02796   Mean   :-0.04758  
#>                     3rd Qu.: 1.01398   3rd Qu.: 0.72303   3rd Qu.: 0.82698  
#>                     Max.   : 2.40200   Max.   : 3.94352   Max.   : 2.14825  
#>                     NA's   :21         NA's   :21         NA's   :12        
#>       tr4                tr5               tr6               tr7          
#>  Min.   :-1.81014   Min.   :-2.5732   Min.   :-3.3907   Min.   :-1.71370  
#>  1st Qu.:-0.99794   1st Qu.:-0.8868   1st Qu.:-0.7514   1st Qu.:-0.86893  
#>  Median :-0.31622   Median :-0.3951   Median :-0.2689   Median :-0.36765  
#>  Mean   :-0.06366   Mean   :-0.1037   Mean   :-0.1303   Mean   :-0.01862  
#>  3rd Qu.: 0.70175   3rd Qu.: 0.6844   3rd Qu.: 0.7136   3rd Qu.: 0.82202  
#>  Max.   : 2.41576   Max.   : 2.4383   Max.   : 2.6946   Max.   : 2.29670  
#>  NA's   :19         NA's   :17        NA's   :25        NA's   :14        
#>       tr8               tr9                tr10         
#>  Min.   :-2.2707   Min.   :-2.13225   Min.   :-2.70291  
#>  1st Qu.:-0.7567   1st Qu.:-0.85332   1st Qu.:-0.64328  
#>  Median : 0.3727   Median :-0.27197   Median : 0.28762  
#>  Mean   : 0.1409   Mean   : 0.01169   Mean   : 0.04912  
#>  3rd Qu.: 0.9071   3rd Qu.: 0.82314   3rd Qu.: 0.77875  
#>  Max.   : 2.9302   Max.   : 2.98643   Max.   : 1.98637  
#>  NA's   :14        NA's   :24         NA's   :18

# testing the completeness plot with a 10 trait case
fb_plot_number_traits_by_species(species_10traitsNA, threshold_species_proportion = 0.8)

# => no value for >1 to >5 traits (instead of 100%) and 0 not shown at all

fb_plot_number_traits_by_species(species_10traitsNA[,1:7], threshold_species_proportion = 0.8)

# => no value for >=1 and >=0 missing

fb_plot_number_traits_by_species(species_10traitsNA[,1:4], threshold_species_proportion = 0.8)

# => >=0 shown as in tutorial example

Created on 2024-03-18 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16 ucrt)
#>  os       Windows 11 x64 (build 22631)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  French_France.utf8
#>  ctype    fr_FR.UTF-8
#>  tz       Europe/Paris
#>  date     2024-03-18
#>  pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  class         7.3-22     2023-05-03 [2] CRAN (R 4.3.1)
#>  classInt      0.4-10     2023-09-05 [1] CRAN (R 4.3.1)
#>  cli           3.6.2      2023-12-11 [1] CRAN (R 4.3.2)
#>  colorspace    2.1-0      2023-01-23 [1] CRAN (R 4.3.1)
#>  curl          5.2.1      2024-03-01 [1] CRAN (R 4.3.3)
#>  DBI           1.2.2      2024-02-16 [1] CRAN (R 4.3.3)
#>  digest        0.6.34     2024-01-11 [1] CRAN (R 4.3.1)
#>  dplyr         1.1.4      2023-11-17 [1] CRAN (R 4.3.2)
#>  e1071         1.7-14     2023-12-06 [1] CRAN (R 4.3.2)
#>  evaluate      0.23       2023-11-01 [1] CRAN (R 4.3.2)
#>  fansi         1.0.6      2023-12-08 [1] CRAN (R 4.3.2)
#>  farver        2.1.1      2022-07-06 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.3.1)
#>  fs            1.6.3      2023-07-20 [1] CRAN (R 4.3.1)
#>  funbiogeo   * 0.0.0.9000 2024-02-12 [1] Github (FRBCesab/funbiogeo@7b97ab9)
#>  generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.1)
#>  ggplot2       3.5.0      2024-02-23 [1] CRAN (R 4.3.3)
#>  glue          1.7.0      2024-01-09 [1] CRAN (R 4.3.2)
#>  gtable        0.3.4      2023-08-21 [1] CRAN (R 4.3.1)
#>  highr         0.10       2022-12-22 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.7      2023-11-03 [1] CRAN (R 4.3.2)
#>  KernSmooth    2.23-22    2023-07-10 [1] CRAN (R 4.3.2)
#>  knitr         1.45       2023-10-30 [1] CRAN (R 4.3.2)
#>  labeling      0.4.3      2023-08-29 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.1)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.3.1)
#>  pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.1)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.1)
#>  proxy         0.4-27     2022-06-09 [1] CRAN (R 4.3.1)
#>  purrr         1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
#>  R.cache       0.16.0     2022-07-21 [1] CRAN (R 4.3.2)
#>  R.methodsS3   1.8.2      2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo          1.26.0     2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils       2.12.3     2023-11-18 [1] CRAN (R 4.3.2)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.1)
#>  Rcpp          1.0.12     2024-01-09 [1] CRAN (R 4.3.2)
#>  reprex        2.1.0      2024-01-11 [1] CRAN (R 4.3.1)
#>  rlang         1.1.3      2024-01-10 [1] CRAN (R 4.3.2)
#>  rmarkdown     2.26       2024-03-05 [1] CRAN (R 4.3.3)
#>  rstudioapi    0.15.0     2023-07-07 [1] CRAN (R 4.3.1)
#>  scales        1.3.0      2023-11-28 [1] CRAN (R 4.3.2)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.1)
#>  sf            1.0-15     2023-12-18 [1] CRAN (R 4.3.2)
#>  styler        1.10.2     2023-08-29 [1] CRAN (R 4.3.2)
#>  tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.1)
#>  tidyselect    1.2.1      2024-03-11 [1] CRAN (R 4.3.3)
#>  units         0.8-5      2023-11-28 [1] CRAN (R 4.3.2)
#>  utf8          1.2.4      2023-10-22 [1] CRAN (R 4.3.2)
#>  vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.3.2)
#>  withr         3.0.0      2024-01-16 [1] CRAN (R 4.3.2)
#>  xfun          0.41       2023-11-01 [1] CRAN (R 4.3.2)
#>  xml2          1.3.6      2023-12-04 [1] CRAN (R 4.3.2)
#>  yaml          2.3.8      2023-12-11 [1] CRAN (R 4.3.2)
#>
#>  [1] C:/Users/greniem/AppData/Local/R/win-library/4.3
#>  [2] C:/Program Files/R/R-4.3.1/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

This bug was reported by Sébastien Villéger through our beta-test feedback form.

We have two options to change this behavior: either we change the y-axis so that its bottom category is the one covering 100% of species, or we populate the graph for all intermediate categories.

I would prefer the former option, as it avoids repeating the same line over and over, which would make the chart less informative.
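To make the two options concrete, here is a minimal base R sketch on hypothetical per-species counts of known traits. The names `n_known`, `all_categories`, `first_k`, and `trimmed` are illustrative only and are not taken from the `funbiogeo` code base.

```r
# Hypothetical number of known traits for 4 species, out of 5 traits
n_known  <- c(5, 4, 4, 3)
n_traits <- 5

# Proportion of species with at least k known traits, for every k
k    <- 0:n_traits
prop <- vapply(k, function(i) mean(n_known >= i), numeric(1))

# Second option: keep every category, including the repeated 100% rows
all_categories <- data.frame(min_traits = k, prop_species = prop)

# First option: start the axis at the largest k that still covers
# 100% of species (k = 0 always does, so max() is always defined)
first_k <- max(k[prop == 1])
trimmed <- all_categories[all_categories$min_traits >= first_k, ]

trimmed
#>   min_traits prop_species
#> 4          3         1.00
#> 5          4         0.75
#> 6          5         0.25
```

One thing this sketch suggests: when at least one species has no known trait at all, only the ≥0 category reaches 100%, so trimming the axis removes nothing and the intermediate categories still have to be populated.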

Also adding a slightly similar edge case:

set.seed(20240308)

library("funbiogeo")

traits = data.frame(
  species = paste0("sp", seq_len(50)),
  trait_1 = rnorm(50),
  trait_2 = rexp(50),
  trait_3 = sample(letters, 50, replace = TRUE)
)

traits[1,] = NA

fb_plot_number_traits_by_species(traits)

Created on 2024-03-18 with reprex v2.1.0


Here, only one species has missing traits, while the other 49 each have 3 known traits. The plot should therefore show 98% of species for every intermediate category, but it doesn't.
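The 98% figure can be checked by hand with base R. This is a sketch under two assumptions that differ slightly from the reprex above: species names are built with `seq_len(50)`, and only the trait columns of the first species are set to `NA`. The `n_known` and `prop` names are hypothetical.

```r
set.seed(20240308)

traits <- data.frame(
  species = paste0("sp", seq_len(50)),
  trait_1 = rnorm(50),
  trait_2 = rexp(50),
  trait_3 = sample(letters, 50, replace = TRUE)
)

# Assumption: blank out only the trait columns of the first species
traits[1, -1] <- NA

# Known traits per species: 0 for the first species, 3 for the other 49
n_known <- rowSums(!is.na(traits[, -1]))

# Proportion of species with at least k known traits, k = 0..3
prop <- vapply(0:3, function(k) mean(n_known >= k), numeric(1))
names(prop) <- paste0(">=", 0:3)

prop
#>  >=0  >=1  >=2  >=3
#> 1.00 0.98 0.98 0.98
```

The three non-zero categories all equal 49/50, which is the value expected on each corresponding line of the plot.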