Closed grovduck closed 1 year ago
Just FYI, I ran these data through yaImpute and created a bunch of new train/test files. Then I clumsily copied `test_port.py` into a `swo_ecoplot`-specific version and all tests passed. I didn't think it was worth adding all of these files just to verify the porting logic, but we may want to keep the yaImpute output around if we end up getting back to #42?
Definitely reassuring to hear that everything passes with a second dataset, but I agree that it's not needed in the test suite. For the connection with #42, are you thinking that we would verify accuracy against yaImpute one last time before we switch to regression testing and ditch the port tests? That sounds reasonable to me!
> For the connection with #42, are you thinking that we would verify accuracy against yaImpute one last time before we switch to regression testing and ditch the port tests? That sounds reasonable to me!
I'm not sure what I was thinking, but I did just spend a bit of time playing around with `syrupy` and I think I understand the main concepts there. I'll put some follow-up thoughts in #42.
@aazuspan. I think this is ready to go, but thought I'd see if you had any issues with the docstring for `load_swo_ecoplot`.
LGTM!
This PR would add the R6 southwest Oregon (SWO) Ecology plot dataset to the sample data available in `sknnr`. The dataset consists of a species matrix (percent cover by tree species) and an environmental matrix (climate, topography, and spectral) for 3,005 plots in SWO measured in 2000. The R6 Ecology Program installed these plots and this PR is contingent upon their permission to use these data.

This PR adds a new function (`load_swo_ecoplot`) which makes these data available to all `sknnr` (and `scikit-learn`) estimators.

Pending approval, the data in `swo_ecoplot_env.csv` and `swo_ecoplot_spp.csv` are all dummy values, but do currently pass our test suite.

EDIT: Actual data is now committed after approval from R6 Ecology Group and data citation is added.
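For anyone skimming, here is a rough sketch of the general idea behind a paired-matrix loader: read an environmental table and a species table, align them on a shared plot ID, and hand back `X` (features) and `y` (targets). This is not the code in this PR. The helper names (`read_matrix`, `load_paired`), the `PLOT_ID` key, and every column name and value below are made up for illustration, and it uses only the standard library rather than the `sknnr` API.

```python
import csv
import io

# Hypothetical miniature stand-ins for the environmental and species CSV files.
# Column names and values are invented for illustration only.
ENV_CSV = """PLOT_ID,ANNPRE,ELEV
1,1200.0,450.0
2,900.0,1100.0
3,1500.0,300.0
"""

SPP_CSV = """PLOT_ID,ABCO,PSME
1,0.0,35.5
2,12.0,20.0
3,0.0,60.0
"""


def read_matrix(text, id_column="PLOT_ID"):
    """Parse CSV text into (column names, {plot id: list of float values})."""
    reader = csv.DictReader(io.StringIO(text))
    names = [name for name in reader.fieldnames if name != id_column]
    rows = {int(rec[id_column]): [float(rec[n]) for n in names] for rec in reader}
    return names, rows


def load_paired(env_text, spp_text):
    """Align both matrices on the plot IDs they share and return X and y."""
    env_names, env = read_matrix(env_text)
    spp_names, spp = read_matrix(spp_text)
    ids = sorted(env.keys() & spp.keys())  # keep only plots present in both tables
    X = [env[i] for i in ids]  # environmental features, one row per plot
    y = [spp[i] for i in ids]  # species cover targets, same row order as X
    return ids, env_names, X, spp_names, y


ids, feature_names, X, target_names, y = load_paired(ENV_CSV, SPP_CSV)
```

Sorting the shared IDs keeps the rows of `X` and `y` in the same order, which is what any multi-output estimator would need from such a loader.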