plantphys / spectratrait

A tutorial R package for illustrating how to fit, evaluate, and report spectra-trait PLSR models. The package provides functions that enhance the base functionality of the R pls package, identify an optimal number of PLSR components, and standardize model validation, along with vignette examples that use datasets sourced from EcoSIS (ecosis.org).
GNU General Public License v3.0

Adding group stratification to pls_permutation and selection of the number of components #90

Closed asierrl closed 2 years ago

asierrl commented 2 years ago

If keeping the distribution of the validation data set similar to that of the calibration data set matters for model validation, and this is achieved by stratifying the random split of the data, then I think it is equally important when permuting the calibration data set to select the optimal number of components. I suggest improving find_optimal_components by allowing it to stratify the random sampling in each permutation.

In case you find it interesting, and although I am far from an expert, I have already modified pls_permutation and find_optimal_components to accept a groups object of the form c("var1", "var2", ..., "varn"). I named these functions pls_permut_by_groups and find_optimal_comp_groups. pls_permut_by_groups now seems to perform well. I'm not sure about find_optimal_comp_groups, as I get the following error:

Error: 'pls_permut_by_goups' is not an exported object from 'namespace:spectratrait'
Called from: getExportedValue(pkg, name)

I understand the issue, but I don't know how to tell getExportedValue that the new function is not in spectratrait. Of course, this should not be an issue if you decide the function can be included in spectratrait.
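For readers following the thread, the idea of stratifying each permutation can be sketched in base R. This is only an illustration under stated assumptions: `stratified_indices`, its `prop` argument, and the toy data frame are hypothetical names made up here. They are not part of spectratrait and are not the `pls_permut_by_groups` code mentioned above.

```r
# Hypothetical helper (not spectratrait API): pick calibration rows so that
# every combination of the grouping variables contributes the same proportion.
stratified_indices <- function(data, groups, prop = 0.7) {
  # One stratum label per row, built from the grouping columns
  strata <- interaction(data[, groups, drop = FALSE], drop = TRUE)
  # Row indices belonging to each stratum
  idx_by_stratum <- split(seq_len(nrow(data)), strata)
  picked <- lapply(idx_by_stratum, function(idx) {
    # Keep at least one row per stratum; never more than the stratum holds
    n_pick <- max(1, round(prop * length(idx)))
    idx[sample.int(length(idx), n_pick)]
  })
  sort(unlist(picked, use.names = FALSE))
}

# Toy data (made up): 2 sites x 2 species, 5 rows in each combination
set.seed(42)
toy <- data.frame(
  site    = rep(c("A", "B"), each = 10),
  species = rep(c("x", "y"), times = 10),
  trait   = rnorm(20)
)

# Draw a fresh stratified subset for each permutation iteration
n_iter <- 5
splits <- lapply(seq_len(n_iter), function(i) {
  stratified_indices(toy, groups = c("site", "species"), prop = 0.6)
})

# Rows drawn per site/species combination in the first iteration
counts_first <- table(toy$site[splits[[1]]], toy$species[splits[[1]]])
print(counts_first)
```

In a real permutation routine, each element of `splits` would index the rows used to fit the PLSR model in that iteration. On the error quoted above: `pkg::name` looks up only objects that the package exports, so a helper defined in your own session or script is called by its bare name, as `stratified_indices` is here, and not with a `spectratrait::` prefix.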

serbinsh commented 2 years ago

@asierrl Thanks! This is indeed a great suggestion, and I will add it to our development plan. If you have already drafted new functions, could you either 1) create a new branch of spectratrait, add the functions, and open a PR for me to test/explore, 2) send me an R script with the new function drafts, or 3) paste the code in this thread?

I could add them to a PR for us to test and put you down as a function author.

asierrl commented 2 years ago

Great, @serbinsh. I forked the repository, created a new branch, uploaded the new files, and created a pull request. I'm not sure whether you automatically have permission to merge it... I guess so.

serbinsh commented 2 years ago

This is being addressed in PR #93.

serbinsh commented 2 years ago

This is done.