NIEHS / PrestoGP

Penalized Regression on Spatiotemporal Outcomes using Gaussian Processes a.k.a. PrestoGP
https://niehs.github.io/PrestoGP/

Fix tests. Fix linting issues. Added release action. #46

Closed · sciome-bot closed this 5 months ago

kyle-messier commented 5 months ago

@sciome-bot @shail-choksi @ericbair-sciome Do y'all want to merge this pull request into main now or wait until other checks pass?

codecov[bot] commented 5 months ago

Welcome to Codecov :tada:

Once merged to your default branch, Codecov will compare your coverage reports and display the results in this comment.

Thanks for integrating Codecov - We've got you covered :open_umbrella:

ericbair-sciome commented 5 months ago

> @sciome-bot @shail-choksi @ericbair-sciome Do y'all want to merge this pull request into main now or wait until other checks pass?

It looks like all tests are passing now, so please go ahead and merge. (And for the record, I checked the coverage, and essentially all of the code not covered by tests is for features that are not fully implemented yet, so that's encouraging.) R CMD check is still failing, but I'm going to work on fixing that next.

kyle-messier commented 5 months ago

Do we know why the test-coverage is taking ~2 hours to run? Is it fitting GPs?

ericbair-sciome commented 5 months ago

> Do we know why the test-coverage is taking ~2 hours to run? Is it fitting GPs?

Yeah, the multivariate regression models have high variance, particularly for the covariance parameter estimates. The only way I could be confident that they were producing accurate results is by fitting a model with a large sample size, which takes some time. On my machine (which is fairly fast and has an upgraded BLAS), the tests take about 30 minutes to run. Brian said that they were taking well over an hour on his laptop. I'm not sure there is a good alternative, though, since the output of prestogp_fit doesn't seem to be deterministic even if I use set.seed().

Are the tests running on a GitHub server or some local machine? You can speed things up significantly by upgrading BLAS. That may not be possible if the GitHub tests are running on the cloud somewhere. But if you/your postdocs are using Macs, you probably want to replace the default BLAS library with Apple's Accelerate BLAS library. (I can send you a link if you need instructions. It's easy to do.)

kyle-messier commented 5 months ago

@ericbair-sciome Ok - I'm glad there is an extensive test of the MV regression. However, I think we should strongly consider an alternative CI/CD strategy for tests like that. @shail-choksi Could we set up another action that runs the long-running tests on a different trigger instead of on every pull request or merge? For example, once a day or week, or on request? That way we would have the standard CI/CD tests on, say, 95% of the code for pull/merge requests and the other strategy for the rest.

shail-choksi commented 5 months ago

@Spatiotemporal-Exposures-and-Toxicology There are 2 options we have:

  • We install OpenMP and increase the number of processors the action runs on to speed up the overall time
  • And/or we split the test actions into a full-coverage/long-running set and a 95%-coverage/fast set

kyle-messier commented 5 months ago

> @Spatiotemporal-Exposures-and-Toxicology There are 2 options we have:
>
>   • We install OpenMP and increase the number of processors the action runs on to speed up the overall time
>   • And/or we split the test actions into a full-coverage/long-running set and a 95%-coverage/fast set

@shail-choksi My thought was to go with option 2, but if you recommend 1, then that is fine too.
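
For what it's worth, a rough sketch of what option 2 could look like: a second workflow that runs the long/full test suite on a schedule or on request, while the regular pull-request workflow keeps only the fast tests. The file name, cron schedule, and action versions below are illustrative assumptions, not what the repo currently uses:

```yaml
# .github/workflows/slow-tests.yaml (hypothetical file name)
name: slow-tests

on:
  schedule:
    - cron: "0 6 * * 1"   # e.g. once a week, Mondays at 06:00 UTC
  workflow_dispatch:       # allows "on request" runs from the Actions tab

jobs:
  slow-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::devtools
      - name: Run the full (long-running) test suite
        run: devtools::test()
        shell: Rscript {0}
```

The PR-triggered workflow could then skip the slow tests, for example by guarding them with an environment variable checked via testthat::skip_if(), so that pull requests stay fast.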

ericbair-sciome commented 5 months ago

I don't think multithreading is going to speed things up significantly here, and it would require some significant rewriting of the multivariate code to use multithreading at all. I would rather avoid that for the near future, since it would take considerable time and effort, and I think the multivariate code should be fast enough to analyze the pesticide data as it is. Also, the slow tests are responsible for covering at least 70-80% of the code, so there is no way to get reasonable coverage if they are removed.

Do you have to pay for computing time on GitHub (or whatever machine runs these tests)? If not, does it even matter if the tests are slow? I doubt we are going to be doing more than 1-2 commits to GitHub per week most of the time anyway.

Otherwise, the only option I can think of would be to remove the tests verifying that the Matern parameter estimates are accurate. Since it is notoriously difficult to estimate variance/covariance parameters, we need a big sample size to verify that these estimates are working correctly. If we don't care about the Matern parameters and only care about the regression coefficients, a much smaller sample size should be adequate. I am reluctant to do that, because I want to know if a change to the code is causing a change to the Matern parameter estimates even if we are primarily concerned with the regression coefficients, but it's an option if speeding up the tests is critical. Failing that, the only remaining option would be to give up on running tests on GitHub entirely and just run devtools::test() locally.

ericbair-sciome commented 5 months ago

I guess the other potential solution, as I mentioned earlier, would be to try to upgrade BLAS on whatever machine is running these tests. Using the Apple Accelerate BLAS on my MacBook or OpenBLAS on WSL, the tests only take 20-30 minutes to run.

kyle-messier commented 5 months ago

@ericbair-sciome There is no cost for the GitHub Actions since the repo is public. We can leave the tests as is for now, but perhaps change them away from the standard merge/pull request trigger. Do y'all know the other YAML options for [on] (i.e., the very beginning of the YAML workflow file)?

shail-choksi commented 5 months ago

Sorry, with option 1 I meant to say install BLAS, not OpenMP.

kyle-messier commented 5 months ago

> Sorry, with option 1 I meant to say install BLAS, not OpenMP.

Okay, it sounds like we should do this for the GitHub Action. Should BLAS be included as a repo dependency too?
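
To make that concrete, here is a hedged sketch of what installing a faster BLAS on the GitHub-hosted runner could look like. The job name and step placement are assumptions; on Ubuntu this is a system-level dependency of the runner, so it would live in the workflow file rather than in the package's DESCRIPTION:

```yaml
jobs:
  test-coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install OpenBLAS
        # libopenblas-dev registers OpenBLAS as the default libblas.so.3
        # alternative on Ubuntu, so R should pick it up without any change
        # to the package itself.
        run: sudo apt-get update && sudo apt-get install -y libopenblas-dev
      - uses: r-lib/actions/setup-r@v2
      # ... remaining setup and test steps unchanged ...
```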

shail-choksi commented 5 months ago

> @ericbair-sciome There is no cost for the GitHub Actions since the repo is public. We can leave the tests as is for now, but perhaps change them away from the standard merge/pull request trigger. Do y'all know the other YAML options for [on] (i.e., the very beginning of the YAML workflow file)?

I have used other options for triggering the workflows - what did you want to change it to? Some of the relevant options are:
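
For illustration (these are generic GitHub Actions triggers with example values, not necessarily the specific options being referred to here), the on: block at the top of a workflow file can combine several triggers:

```yaml
# Generic examples of workflow triggers (illustrative values only):
on:
  push:
    branches: [main]        # run on every push to main
  pull_request:
    branches: [main]        # run on pull requests targeting main
  schedule:
    - cron: "0 5 * * *"     # run on a timer (here, daily at 05:00 UTC)
  workflow_dispatch:        # run manually on request from the Actions tab
```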

ericbair-sciome commented 5 months ago

I would be inclined to upgrade BLAS and go from there. Hopefully that speeds things up enough to make it a non-issue. Having said that, if we don't want to run tests on every single commit/merge, I don't see that as a major issue. I always run devtools::test() before I commit anything anyway (and I strongly encourage others to do the same). The coverage information on GitHub is useful for me, but it isn't something that needs to be updated super often. Otherwise I don't think it's telling me anything that I wouldn't already get from devtools::test() and devtools::check().

kyle-messier commented 5 months ago

@ericbair-sciome @shail-choksi Sounds good - let's just update BLAS and let the actions be for the time being. If it runs a long time, it's not hurting anything at the moment.

shail-choksi commented 5 months ago

@Spatiotemporal-Exposures-and-Toxicology great - I will start working on that then.

Also, for the release action, like I mentioned earlier, I have set it to run on a new tag push. As an example, I pushed a tag that ran this: https://github.com/Spatiotemporal-Exposures-and-Toxicology/PrestoGP/actions/runs/7509148264

That action does the following:

Releases are listed here: https://github.com/Spatiotemporal-Exposures-and-Toxicology/PrestoGP/releases and the example run created this one: https://github.com/Spatiotemporal-Exposures-and-Toxicology/PrestoGP/releases/tag/v0.0.1

Also, to be consistent, we should name the tags the same as the version of the package.

Let me know if you have any changes/concerns about the process above. If not, I will go ahead and delete the tag and release.
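
For reference, a minimal sketch of what a tag-triggered release workflow along these lines can look like (the file name, action versions, and tag pattern are assumptions, and the actual release action in the repo may do more than this):

```yaml
# .github/workflows/release.yaml (sketch only; the real action may differ)
name: release

on:
  push:
    tags:
      - "v*"                 # e.g. v0.0.1, matching the package version

jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write        # required to create a GitHub release
    steps:
      - uses: actions/checkout@v4
      - name: Create a GitHub release for the pushed tag
        run: gh release create "$GITHUB_REF_NAME" --generate-notes
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```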

shail-choksi commented 5 months ago

@Spatiotemporal-Exposures-and-Toxicology @ericbair-sciome

I tried speeding up the tests but was unable to do so.

If we want decent code coverage, I don't think we can remove these tests or run only a small subset.

Let me know what you both think and what we should implement

kyle-messier commented 5 months ago

@ericbair-sciome @shail-choksi We can keep the tests as is for now - coverage is more important. We (NIEHS) are in the process of moving to the NIH GitHub Enterprise, so we can try different runners when that happens.

ericbair-sciome commented 4 months ago

One interesting (and rather weird) thing I noticed as I have watched this latest update in real time: it looks like the testthat tests are being run twice, once during the R-CMD-check workflow and once during the test-coverage workflow. But the R-CMD-check run is much faster. It says that the entire R-CMD-check process took 32 minutes (which includes the testthat tests), while the test-coverage version of the tests is still running (almost an hour later). So maybe the covr coverage tracking is slowing down the tests? I'm just speculating. As I said, it's probably not worth investing too much time or energy into fixing this, but I thought it was worth mentioning. Tagging @shail-choksi just in case he has any idea what might be causing this.
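
For context, the test-coverage workflow in the usual r-lib/actions template runs the tests through covr rather than plain testthat, and covr traces every line of package code as the tests run, which is a plausible reason for the slowdown. A rough sketch of that step, assuming the repo follows the standard template:

```yaml
      # Typical test-coverage step from the r-lib/actions template. covr
      # re-runs the testthat suite with line-by-line tracing enabled, which
      # is generally slower than running the same tests under R CMD check.
      - name: Test coverage
        run: covr::codecov(quiet = FALSE)
        shell: Rscript {0}
```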