HEPCloud / decisionengine

HEPCloud Decision Engine framework
Apache License 2.0
6 stars 26 forks source link

cherrypy metrics server blocking startup #608

Closed jcpunk closed 2 years ago

jcpunk commented 2 years ago

This should fix the startup blocking issue with cherrypy and add it to the testing framework.

It about doubles the testing time on my system, so I added some more workers to pytest-xdist to try and offset the cost.

codecov[bot] commented 2 years ago

Codecov Report

Merging #608 (4a235e5) into master (ca5ddf6) will increase coverage by 0.56%. The diff coverage is 85.71%.

:exclamation: Current head 4a235e5 differs from pull request most recent head 08ad2c6. Consider uploading reports for the commit 08ad2c6 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #608      +/-   ##
==========================================
+ Coverage   95.95%   96.52%   +0.56%     
==========================================
  Files          47       47              
  Lines        2844     2847       +3     
  Branches      451      451              
==========================================
+ Hits         2729     2748      +19     
+ Misses         85       69      -16     
  Partials       30       30              
Flag Coverage Δ
python-3.10 96.20% <85.71%> (+0.53%) :arrow_up:
python-3.6 96.11% <85.71%> (+0.63%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../decisionengine/framework/engine/DecisionEngine.py 90.27% <83.33%> (+2.77%) :arrow_up:
src/decisionengine/framework/util/sockets.py 100.00% <100.00%> (ø)
.../decisionengine/framework/engine/ChannelWorkers.py 90.78% <0.00%> (+2.63%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update ca5ddf6...08ad2c6. Read the comment docs.

jcpunk commented 2 years ago

Moved to be earlier, I'd love some thoughts from @shreyb too

mambelli commented 2 years ago

@jcpunk unit tests failed now. Maybe was an intermittent error (crashing the runner?). Can you check or refresh the PR to restart the tests?

shreyb commented 2 years ago

Moved to be earlier, I'd love some thoughts from @shreyb too

I agree with Marco that ideally we'd start the metrics server first, since then we could monitor the starting up of channels and not just their status after they had already started. That being said, if it's causing issues to have the webserver start first, I'm fine with it being last.

As far as the unit tests failing, I did notice that starting the webserver in the unit tests caused the tests to hang (and then fail), but NOT when I actually ran the DE outside the scope of unit tests. For that reason, in the commit https://github.com/HEPCloud/decisionengine/commit/a409f12654d9301784b8cdc813c25a0cd7efaab7, I set the default in the test fixtures to not have the webserver start up.