It's the API calls that make the iotools tests take so long. I assume you're considering this mainly for its impact on Azure Pipelines, right? Locally you can split the tests across multiple processes to speed them up, but that isn't usable on a free Azure Pipelines account.
I explored parallelism for Azure Pipelines briefly when I came on board to help admin pvlib. Azure Pipelines requires a paid account to enable more parallel jobs, and even then I think it splits separate test jobs rather than providing another core for the primary process. Maybe there's an async solution out there that could run something locally while waiting on an API call, but it would have to be a turnkey solution to be worth the effort, if it exists at all. Probably the only thing we can do at this point is minimize the number of API calls.
One idea is to mock the network calls by default [1] but leave the option to perform them when specifically requested [2].
[1] https://requests-mock.readthedocs.io/en/latest/pytest.html
[2] https://docs.pytest.org/en/2.9.1/example/simple.html#control-skipping-of-tests-according-to-command-line-option
(edit -- the [2] link above is super out of date, not sure why it showed up for me on google. a little irrelevant since pytest-remotedata does the same thing anyway)
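To make the mocking idea concrete, here is a minimal sketch using the `requests_mock` pytest fixture from [1]; the test name, URL, and payload are illustrative only, not an actual pvlib endpoint:

```python
import requests


def test_read_station_data_mocked(requests_mock):
    # requests-mock's pytest fixture intercepts HTTP traffic made via requests,
    # so no real network call happens here.
    url = "https://example.org/api/station.csv"  # illustrative URL
    requests_mock.get(url, text="timestamp,ghi\n2019-01-01 12:00,500\n")

    resp = requests.get(url)
    assert resp.status_code == 200
    assert "ghi" in resp.text
```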
I have two suggestions, in addition to Kevin's idea.
Don't run the tests that make API calls on PR commits; run them only on commits to master, or even just for releases. These tests are exactly the same every time, we're not changing the inputs, so like Cameron and Kevin say, they're mainly testing whether the API itself has changed. I think we only need to check that once in a while (see the sketch after the next suggestion).
Split off `iotools` into a separate repo and package, call it `pvlib.iotools`, and make it a dependency of pvlib so that it installs automatically as if it were already part of pvlib, similar to Sphinx extensions.
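For the first suggestion, a minimal `conftest.py` sketch following the command-line-option pattern from [2] could look like this (the `--run-network` flag and the `remote_data` marker name are hypothetical here; pytest-remotedata implements the same idea out of the box):

```python
# conftest.py
import pytest


def pytest_addoption(parser):
    parser.addoption("--run-network", action="store_true", default=False,
                     help="run tests that hit live external APIs")


def pytest_collection_modifyitems(config, items):
    # Skip network-marked tests unless the flag is given, e.g. on a
    # master-only or release-only CI job that passes --run-network.
    if config.getoption("--run-network"):
        return
    skip_network = pytest.mark.skip(reason="needs --run-network option to run")
    for item in items:
        if "remote_data" in item.keywords:
            item.add_marker(skip_network)
```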
@CameronTStark I'm concerned about both CI speed and local performance. A lot of people seem to rely on the CI, so faster feedback is better for them. I typically run tests locally before pushing to GitHub, starting with the relevant module and then the whole suite. I usually forget to install and use `pytest-xdist`.
I'm in favor of running only the live network tests on one CI instance. I think that's a common pattern for other pydata projects. I'm also in favor of mocking calls where appropriate.
My original hope was that we could simply request less data in some of these calls.
I'm -1 on splitting `iotools` into a separate package at this point, but open to it in the future. I think it's too much overhead for now.
> I'm -1 on splitting `iotools` into a separate package at this point, but open to it in the future. I think it's too much overhead for now.
I was a little scared of that too, but just mentioned it as a remote possibility, nuclear option, thx
> I'm in favor of running only the live network tests on one CI instance. I think that's a common pattern for other pydata projects. I'm also in favor of mocking calls where appropriate.
I think a wiki or doc page in the contribution guidelines (maybe I missed this) clarifying expected testing patterns and recommended PR test procedures could help get everyone aligned.
@wholmgren now that pytest-remotedata has been implemented, are you satisfied with the speed or do you see more room for improvement?
Yes, I'm happy with the speed when the remote data tests are excluded. I don't think it's worth the effort to make the remote tests faster at this point in time. Thanks everyone here for your help and ideas.
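For reference, the adopted pattern looks roughly like this: tests that hit live services carry the `remote_data` marker and are skipped unless pytest is invoked with `--remote-data` (the test body and URL below are illustrative only):

```python
import pandas as pd
import pytest


@pytest.mark.remote_data  # skipped by default; run with ``pytest --remote-data``
def test_live_download():
    # Illustrative only -- a real test would call an iotools reader here.
    data = pd.read_csv("https://example.org/api/irradiance.csv")
    assert not data.empty
```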
The test suite is getting to be quite slow. Here are the 20 slowest tests on my mac:
I will work on the `forecast` tests. Any ideas for the `iotools` tests?