choderalab / perses

Experiments with expanded ensembles to explore chemical space
http://perses.readthedocs.io
MIT License
181 stars 51 forks source link

Triage Tests #764

Open mikemhenry opened 3 years ago

mikemhenry commented 3 years ago

We need to take inventory of our current tests and see which ones we skip on CI (currently both @attr('advanced') and @skipIf(running_on_github_actions, "Skip advanced test on GH Actions")

We need to assess:

We also need to assess where we can improve our test coverage.

mikemhenry commented 3 years ago

Good news, with the way the tests and coverage report are setup, we can also look at the coverage for our tests:

https://codecov.io/gh/choderalab/perses/tree/master/perses/tests

We should be at 100% or at least ensure that our code coverage stays the same even when we run the longer running tests, if possible.

jchodera commented 3 years ago

which tests need GPUs

Perhaps @dominicrufa could help label all these tests with the @attr('gpu_required')?

We should be at 100% or at least ensure that our code coverage stays the same even when we run the longer running tests, if possible.

Can you help me read this? I think we don't want to look at the coverage of methods in the unit test files, but rather the coverage of methods in the actual codebase, right? Like this? image

dominicrufa commented 3 years ago

We need to take inventory of our current tests and see which ones we skip on CI (currently both @attr('advanced') and @skipIf(running_on_github_actions, "Skip advanced test on GH Actions")

We need to assess:

  • which tests are too slow to run on CI
  • which tests need GPUs

i don't think any tests necessarily require GPUs, but perhaps the ones that run slow (presumably the ones that have @skipif) could use a GPU to speed up? I'm not sure.

jchodera commented 3 years ago

i don't think any tests necessarily require GPUs, but perhaps the ones that run slow (presumably the ones that have @skipif) could use a GPU to speed up? I'm not sure.

@dominicrufa : The idea is to mark slow tests that run quickly if a GPU is present. Could you help us identify those?

There may also be slow tests where a GPU won't help speed things up. Those may be marked as slow and identified for performance enhancements later.

dominicrufa commented 3 years ago

let's see. things that would probably be faster with a gpu:

the ones that need to be optimized...

theses test functionality that we don't use:

and i am not sure why this: https://github.com/choderalab/perses/blob/master/perses/tests/test_utils.py has skipped tests (perhaps something to do with openeye)

mikemhenry commented 3 years ago

@jchodera Let me clarify:

There are two different kinds of coverage I'm talking about, coverage of the application and coverage of the tests.

100% coverage for an application would take a lot of effort and really isn't necessary -- A good thing to strive for but dev effort is better spend elsewhere once we have core coverage. When we look at coverage for the application, we want to see all the important bits of the code covered.

100% coverage in this test tree is what we want to be our goal. If coverage isn't 100% here, that means we have written tests that are not running (or helper functions that are not being called). An exception for tests that run long or can't run in the CI environment are an exception. This is an example of what I'm talking about: https://twitter.com/coveragepy/status/1364642550573899781 Looking at test coverage coverage helps to identify tests that we think should be running, but are not.

So looking at both the coverage of the application and of tests/ is helpful. The coverage tells us different things and have different ultimate goals.

@dominicrufa Thanks for this list! I'll start decorating with something like @skipif(NO_GPU) as well as look into why some of the tests we expect to run are not.

For the tests that need optimization, I'll need your help if the fixes are not simple, like running fewer steps or using a smaller system size.

jchodera commented 3 years ago

Some of this will be ameliorated with #853, but we'll still have to mark these tests as slow.