funecology / fundiversity

📦 R package to compute functional diversity indices efficiently
https://funecology.github.io/fundiversity/
GNU General Public License v3.0

fd_fdis() gives an error if the site x species matrix does not have row names #76

Closed mahonmb closed 10 months ago

mahonmb commented 1 year ago

I ran into an issue using fd_fdis() when my site x species matrix doesn't have row names. For my purposes, I cannot force my site x species matrices to have row names, as they are contained within a simulation matrix (class simmat). I expect this could be an issue for future users as well.

An easy fix would be to either force the site column in the data frame returned by fd_fdis() to be 1:nrow(sp_com):

data.frame(site = 1:nrow(sp_com), FDis = fdis_site, row.names = NULL)

or use an if statement:

if (is.null(rownames(sp_com))) {
  rownames(sp_com) <- 1:nrow(sp_com)
}
data.frame(site = rownames(sp_com), FDis = fdis_site, row.names = NULL)
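The second suggestion can be sketched in base R as follows. This is only an illustration: `fill_site_names()` is a hypothetical helper, not part of fundiversity, and the FDis values are placeholders rather than real output.

```r
# Hypothetical helper, not part of fundiversity: give a site x species
# matrix default row names when it has none.
fill_site_names <- function(sp_com) {
  if (is.null(rownames(sp_com))) {
    # seq_len() avoids the 1:0 pitfall of 1:nrow() on a zero-row matrix
    rownames(sp_com) <- seq_len(nrow(sp_com))
  }
  sp_com
}

# Toy 3 sites x 2 species matrix without row names
sp_com <- matrix(c(1, 0, 1, 1, 1, 0), nrow = 3,
                 dimnames = list(NULL, c("sp1", "sp2")))

# Placeholder values standing in for the computed FDis of each site
fdis_site <- c(0.5, 1.2, 0.8)

sp_com <- fill_site_names(sp_com)
res <- data.frame(site = rownames(sp_com), FDis = fdis_site, row.names = NULL)
```

One difference between the two suggestions: going through `rownames()` stores the site labels as text ("1", "2", "3"), whereas `site = 1:nrow(sp_com)` keeps them as integers.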

Rekyt commented 1 year ago

Dear @mahonmb, First of all thank you for pointing out the bug! Second of all, thank you for suggesting a fix!

Indeed, we always thought that the input would be matrix objects but didn't realize other objects would also work with slight modifications.

I'll be fixing this behavior for all functions and include additional tests to cover it.

Rekyt commented 1 year ago

One issue I'm realizing is that when the user provides only a trait dataset and no site-species dataset, the site is automatically named s1 (see below). For the use case described above, we could use site names like s1, s2, ..., sN, where N is the number of rows.

library("fundiversity")

fd_fdis(traits_birds)
#>   site     FDis
#> 1   s1 133.3902

Created on 2023-08-20 with reprex v2.0.2

Should we use this behavior to be consistent? Or rather revert to site names as indices?
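To make the two options concrete, here is a small base R sketch. The helper `default_site_names()` and its `prefix` argument are hypothetical and only illustrate the naming schemes; they are not fundiversity API.

```r
# Hypothetical helper, not fundiversity API: build default site names
# for a site x species matrix that has no row names.
default_site_names <- function(sp_com, prefix = "s") {
  paste0(prefix, seq_len(nrow(sp_com)))
}

# Toy matrix with 3 unnamed sites
sp_com <- matrix(1, nrow = 3, ncol = 2)

prefixed <- default_site_names(sp_com)  # "s1" "s2" "s3"
indices  <- seq_len(nrow(sp_com))       # 1 2 3
```

With a single row, the prefixed scheme yields "s1", which matches the label in the output above when no site-species matrix is supplied.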

Do you have an opinion @Bisaloo?

mahonmb commented 1 year ago

@Rekyt You're welcome. I ended up modifying fd_fdis() with my first suggestion.

Thank you (and all of your coauthors) for making this package! It has been a huge help in reducing my code run times! I have 9 regions with 10,000 simulated/randomized matrices each, so 90,000 matrices were taking days to run with the FD package! With fd_fdis(), it took an hour or two, at most.

Bisaloo commented 1 year ago

> Should we use this behavior to be consistent?

This approach makes the most sense to me.

We just need to make sure that the meaning of s1, s2, etc. is clearly documented. Something like "If the sp_com argument is not provided or if it doesn't have row names, arbitrary row names s1, s2, s3 will be used." The @return section of the docs seems like a good place for this. We already document it in the intro vignette, but it's always good to document everything in multiple places.

We might also add a message() but I'm more on the fence here. Having too many messages can be overwhelming for users.
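As a rough sketch of how the documentation and the optional message could fit together, assuming a hypothetical helper `site_names()` with a hypothetical `verbose` argument (neither is part of fundiversity), the roxygen2 block might look like this:

```r
#' Name the sites of a site x species matrix (hypothetical helper)
#'
#' @param sp_com Site x species matrix.
#' @param verbose If `TRUE`, emit a message when default names are used.
#'
#' @return A character vector of site names. If `sp_com` has no row names,
#'   arbitrary names `s1`, `s2`, `s3`, ... are used.
site_names <- function(sp_com, verbose = FALSE) {
  if (!is.null(rownames(sp_com))) {
    return(rownames(sp_com))
  }
  if (verbose) {
    # Off by default so that users are not flooded with messages
    message("`sp_com` has no row names; using s1, s2, ... as site names.")
  }
  paste0("s", seq_len(nrow(sp_com)))
}

site_names(matrix(0, nrow = 2, ncol = 3))
```

Keeping the message behind an opt-in argument is one way to address the concern about overwhelming users; the documented @return text then carries the explanation by default.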