conveyal / analysis-backend

Server component of Conveyal Analysis
http://conveyal.com/analysis
MIT License
23 stars 12 forks source link

Partial regional results #181

Closed abyrd closed 5 years ago

abyrd commented 5 years ago

This PR makes it possible to fetch regional analysis results when the analysis is not finished yet. In discussion with @trevorgerhardt we came to the conclusion that this would be useful.

The existing endpoint for complete results at "/api/regional/:_id/grid/:format" will detect whether the analysis is complete or incomplete, and will return the grid directly if it's incomplete, otherwise it returns a "wrapped URL" redirect to s3.

This will require changes to the UI since unfinished results are streamed back from a file immediately, not as a redirect. It would also be possible to just have two different HTTP endpoints, one for finished and one for unfinished results.

I am pushing this branch for preliminary review and feedback from @trevorgerhardt, so the branch is not forgotten. I can update the description with more detail once we get a chance to talk about it. It probably needs a little more work before it's really usable.

This PR also moves GridResultAssembler and SelectingGridReducer into the backend from R5 since they are never used directly in R5, and are essentially part of backend operation.

It also removes the ResultAssembler objects when a job is complete - previously it seems they were left in memory forever.

codecov-io commented 5 years ago

Codecov Report

Merging #181 into dev will decrease coverage by 1.56%. The diff coverage is 1.87%.

Impacted file tree graph

@@             Coverage Diff             @@
##               dev     #181      +/-   ##
===========================================
- Coverage     25.6%   24.03%   -1.57%     
  Complexity     103      103              
===========================================
  Files           58       60       +2     
  Lines         2160     2305     +145     
  Branches       195      211      +16     
===========================================
+ Hits           553      554       +1     
- Misses        1573     1717     +144     
  Partials        34       34
Impacted Files Coverage Δ Complexity Δ
...c/main/java/com/conveyal/taui/util/WrappedURL.java 0% <ø> (ø) 0 <0> (ø) :arrow_down:
...conveyal/taui/analysis/RegionalAnalysisStatus.java 0% <ø> (ø) 0 <0> (ø) :arrow_down:
...java/com/conveyal/taui/analysis/broker/Broker.java 12.57% <0%> (-0.41%) 4 <0> (ø)
...n/java/com/conveyal/taui/SelectingGridReducer.java 0% <0%> (ø) 0 <0> (?)
...in/java/com/conveyal/taui/GridResultAssembler.java 0% <0%> (ø) 0 <0> (?)
...l/taui/controllers/RegionalAnalysisController.java 10% <10%> (-0.41%) 2 <0> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update a9fa7c3...bf2a54b. Read the comment docs.

abyrd commented 5 years ago

Based on recent discussion with @trevorgerhardt and @ansoncfit this works the way we want it to - a single endpoint that returns a redirect code with application/json (if the job is finished) or a 202 with application/octet-stream if the job is still running, with partial results. Ready for review and merge.

abyrd commented 5 years ago

Currently you'll get a 302 redirect with type text/plain for completed jobs, and a 200 OK with type application/octet-stream for incomplete jobs. This does allow distinguishing between them but it's kind of weird to give the "normal" OK status code when the job isn't finished. We could change it to 206 but that's kind of abuse of notation. Practically speaking, it should work in current form.

abyrd commented 5 years ago

The partial results fetching works in local testing. I get an ACCESSGR file while the analysis is running, and I get a 404 once I've deleted the job. I can't test S3 redirects because my completed local analyses are not uploading to S3 (region weirdness), but it appears to work exactly like before so should be good.

One thing I notice is that we're always sending URLs ending with /grid?redirect=false then returning redirects anyway. Does anyone know what this boolean redirect parameter is for? It's ignored in the backend code.

abyrd commented 5 years ago

This PR's branch is now caught up with dev, so now includes proxy-content-type. We should test this branch on staging.