conveyal / analysis-backend

Server component of Conveyal Analysis
http://conveyal.com/analysis
MIT License
23 stars 12 forks source link

Don't signal retry for requests that timed out #238

Closed ansoncfit closed 5 years ago

ansoncfit commented 5 years ago

As discussed in #222, analyses that take longer than the timeout will start piling up because the front-end keeps retrying. The changes in this PR warn that the requested analysis was too complex, rather than signalling the front-end to retry.

To test, run an analysis that exceeds the timeout (e.g. a large number of simulated schedules with bike egress on a complex network) and see if the expected warning is displayed in the front-end.

codecov-io commented 5 years ago

Codecov Report

Merging #238 into dev will increase coverage by 0.31%. The diff coverage is 0%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #238      +/-   ##
============================================
+ Coverage     22.75%   23.07%   +0.31%     
  Complexity      103      103              
============================================
  Files            63       62       -1     
  Lines          2430     2405      -25     
  Branches        220      216       -4     
============================================
+ Hits            553      555       +2     
+ Misses         1842     1817      -25     
+ Partials         35       33       -2
Impacted Files Coverage Δ Complexity Δ
...c/main/java/com/conveyal/taui/util/HttpStatus.java 0% <ø> (ø) 0 <0> (ø) :arrow_down:
...om/conveyal/taui/controllers/BrokerController.java 20.18% <0%> (+0.88%) 6 <0> (ø) :arrow_down:
...l/taui/controllers/RegionalAnalysisController.java 9.15% <0%> (-0.07%) 2% <0%> (ø)
...java/com/conveyal/taui/analysis/broker/Broker.java 15.07% <0%> (ø) 4% <0%> (ø) :arrow_down:
.../conveyal/taui/analysis/broker/RedeliveryTest.java 0% <0%> (ø) 0% <0%> (ø) :arrow_down:
.../com/conveyal/taui/persistence/OSMPersistence.java 0% <0%> (ø) 1% <0%> (ø) :arrow_down:
.../taui/analysis/broker/EC2RequestConfiguration.java 0% <0%> (ø) 0% <0%> (ø) :arrow_down:
...in/java/com/conveyal/taui/analysis/broker/Job.java 0% <0%> (ø) 0% <0%> (ø) :arrow_down:
...java/com/conveyal/taui/models/AnalysisRequest.java 0% <0%> (ø) 0% <0%> (ø) :arrow_down:
...com/conveyal/taui/analysis/broker/EC2Launcher.java 0% <0%> (ø) 0% <0%> (ø) :arrow_down:
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 225bbbf...e5e048e. Read the comment docs.

trevorgerhardt commented 5 years ago

Approved, but still have a question, is there a reliable way to test this?

abyrd commented 5 years ago

@trevorgerhardt I have just inadvertently tested this - if you kill the worker and then issue another request the connection will time out. This is an accidental test though - my test is really checking whether the backend will de-register the single point worker when it's shut down. This test is timing out trying to connect to the worker. The situation covered by the error message in this PR is relevant where the connection happens, but the response times out. Eventually we'll refine this same code block to give two different messages for the two different kinds of timeouts.

abyrd commented 5 years ago

I expect the true focus of this PR could be tested by setting a huge number of simulated schedules on a very wide geographic area.