**Closed** — juhoinkinen closed this 1 year ago
Patch coverage: 100.00% and project coverage change: +0.03% :tada:
Comparison is base (3451bd3) 99.63% compared to head (e6e06b8) 99.66%.
The rdflib import could also be made lazy, but the resulting code is not very clean and the time saving is only ~0.05 s.
However, I would still add it. For the command completions, even such a tiny improvement in response time is noticeable, I think.
Seems that upgrading to the soon-to-be-released Connexion 3 requires some changes to the way the application is created, see https://github.com/NatLibFi/Annif/issues/689#issuecomment-1527508776. Hopefully the changes can be incorporated into this.
I see that Codecov complains about cli.py line 31 not being covered by tests, which I found puzzling, since you added a test that runs `annif run --help`, which should execute this line. But then I added some debugging code around that line and found out that the test doesn't actually exercise that line at all. I think that's because using CliRunner (e.g. `runner.invoke(annif.cli.cli, ["run", "--help"])`) doesn't change `sys.argv` at all, so it also won't execute line 31 of cli.py. Is there anything we could do here to make sure that the test also exercises this path?
The line could definitely be run in a test such as

```python
def test_run_separate_process():
    result = os.popen("python annif/cli.py run --help").read()
    assert "Run a local development server." in result
```

But I don't know how to ensure that this really starts Flask with Connexion, and not just Flask... And Codecov does not take this kind of test into account when counting coverage (at least when run locally).
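For what it's worth, `subprocess.run` would be a more idiomatic way to write the same test than `os.popen`. A self-contained sketch; the `python -c` one-liner here is a stand-in for the real `python annif/cli.py run --help` command, so the script runs anywhere:

```python
import subprocess
import sys

# Stand-in for "python annif/cli.py run --help": a -c one-liner that
# prints the same help text the real command would.
cmd = [sys.executable, "-c", "print('Run a local development server.')"]

# capture_output/text give us stdout as a string; check=True raises
# if the subprocess exits non-zero.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
assert "Run a local development server." in result.stdout
```

Unlike `os.popen`, this also surfaces the exit status and stderr of the child process, which makes test failures easier to diagnose.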
The `os.popen` solution doesn't seem very useful, for the reasons you mention. How about using some other information than just `sys.argv[1]` in cli.py when deciding whether to choose Connexion or plain Flask?
For testing parallel.py with different spawn methods, I added a constant definition like this: https://github.com/NatLibFi/Annif/blob/3451bd351273e3425a2bea683f8ac2bf3618278b/annif/parallel.py#L7-L10
The unit test that exercises this code overrides the value using monkeypatching: https://github.com/NatLibFi/Annif/blob/3451bd351273e3425a2bea683f8ac2bf3618278b/tests/test_cli.py#L887-L900
Something similar could perhaps be done here: define a constant in cli.py which uses `sys.argv` and then use it in the `if` clause:
```python
USE_CONNEXION = len(sys.argv) > 1 and sys.argv[1] == "run"

if USE_CONNEXION:
    create_app = annif.create_app  # Use Flask with Connexion
else:
    # Connexion is not needed for most CLI commands, use plain Flask
    create_app = annif.create_flask_app
```
Then you can monkeypatch USE_CONNEXION in the unit test to force it to be True in any case.
...except now that I think of it, I'm not sure that will work, because the monkeypatching is probably applied too late to have an effect. Anyway, something along these lines could work here.
Kudos, SonarCloud Quality Gate passed!
0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells
No Coverage information
0.0% Duplication
Actually, using the `os.popen` solution does get line 31 covered in Codecov (when run in CI). When I locally ran `coverage run -m pytest tests/test_cli.py; coverage report -m | grep annif/cli.py`, that line was reported as missed.
I also made the `annif routes` command start the Connexion app, which registers all REST API endpoints. The list of registered endpoints in the output of the `routes` command is now tested to confirm that Connexion is started.
I think this can be merged. When upgrading to Connexion 3 (#702) this functionality can be, or may need to be, changed again.
Optimizes CLI startup time by avoiding unnecessary imports for basic commands, mainly to improve the CLI command completion response times in #693.

Changes:

- Imports of joblib and numpy are simply moved inside the functions/methods where they are used.
- Import of connexion is avoided for all CLI commands but `annif run` and `annif routes` by using separate creation scripts for plain-Flask and Connexion-Flask applications. When started via `gunicorn "annif:create_app()"`, the Connexion-Flask app is created as previously.

This is a continuation of the optimizations in #514, #544 and #563.
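The first change is the standard deferred-import idiom; a minimal sketch, with the stdlib `json` module standing in for a heavy dependency like joblib or numpy:

```python
def serialize(data):
    # Deferred import: the (stand-in) heavy dependency is only loaded
    # when this function actually runs, so CLI invocations that never
    # call it skip the import cost entirely.
    import json
    return json.dumps(data, sort_keys=True)

print(serialize({"b": 2, "a": 1}))  # → {"a": 1, "b": 2}
```

Python caches the module in `sys.modules` after the first call, so repeated calls pay only a dictionary lookup, not a re-import.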
I checked the import time of the pre-release of Connexion 3, and it was even slower (~0.35 s) than the current v2.14.* (0.12 s), so avoiding the connexion import will be even more valuable in the future.
Improvement compared to current main: the improvement seems to be some 0.3 seconds (~0.1 s coming from each of joblib, numpy, and connexion).
Before (main):
After (PR):
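Timings like these can be collected with CPython's `-X importtime` flag, whose per-module log tuna can visualize. A sketch using a stdlib module as the import target; the real measurement would import connexion, joblib, etc.:

```shell
# Log per-module import times (microseconds, self and cumulative)
# to stderr; decimal stands in for a heavy dependency like connexion.
python -X importtime -c "import decimal" 2> importtime.log

# Each log line shows the cost of one imported module.
grep "decimal" importtime.log

# Visualize the log as an interactive chart (requires: pip install tuna)
# tuna importtime.log
```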
The CLI command completion response times seemed to follow the times of `annif --help`, so they are improved similarly.

Timing imports for completion responses does not show any easy targets for further optimization. rdflib is used via multiple modules and in nearly all methods of `corpus.subject.SubjectFileSKOS`, and I don't know how the rdflib import could be done only when needed there in a nice way.

Tuna output for the timing of `list-*` completion response time: