conveyal / analysis-backend

Server component of Conveyal Analysis
http://conveyal.com/analysis
MIT License

Backend endpoints fail due to empty database responses #218

Closed abyrd closed 5 years ago

abyrd commented 5 years ago

If there is a worker running that is using a network for which the bundle has been deleted, the backend fails to report any active workers.

This can happen if you accidentally start a worker on an old network using a bundle you don't really want anymore. You then realize the mistake and delete the bundle, but the worker stays alive or continues to be reported.

This appears to be due to the line bundle = Persistence.bundles.find(QueryBuilder.start("_id").is(networkId).get()).next(); in BrokerController#getAllWorkers which assumes every network has a bundle. It causes a null pointer exception that interrupts reporting any workers.
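One way to guard a lookup like this is to check the cursor before calling next() and to keep reporting the worker even when its bundle is gone. The following is only a minimal sketch under stated assumptions: WorkerReportSketch, firstOrEmpty, describeWorkers, and the "(bundle deleted)" placeholder are hypothetical names, and a plain java.util.Iterator stands in for the database cursor. It is not the actual BrokerController code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: names here are illustrative, not taken from analysis-backend.
class WorkerReportSketch {

    /** Returns the first element of a cursor-like iterator, or empty if there is none. */
    static <T> Optional<T> firstOrEmpty(Iterator<T> cursor) {
        return cursor.hasNext() ? Optional.ofNullable(cursor.next()) : Optional.empty();
    }

    /**
     * Builds one report line per worker. A worker whose bundle is gone is still
     * reported, with a placeholder instead of a bundle name.
     */
    static List<String> describeWorkers(Map<String, String> workerToNetworkId,
                                        Map<String, List<String>> bundlesByNetworkId) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, String> entry : workerToNetworkId.entrySet()) {
            // Stand-in for the database query: it may yield zero results.
            Iterator<String> cursor = bundlesByNetworkId
                    .getOrDefault(entry.getValue(), List.of())
                    .iterator();
            String bundleName = firstOrEmpty(cursor).orElse("(bundle deleted)");
            report.add(entry.getKey() + " -> " + bundleName);
        }
        return report;
    }

    public static void main(String[] args) {
        Map<String, String> workers = new LinkedHashMap<>();
        workers.put("worker-1", "network-a");
        workers.put("worker-2", "network-b"); // the bundle for network-b was deleted
        Map<String, List<String>> bundles = Map.of("network-a", List.of("Bundle A"));
        describeWorkers(workers, bundles).forEach(System.out::println);
    }
}
```

With this shape, one missing bundle affects only the line for that worker rather than interrupting the whole report.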

See also #166, which may already be resolved.

abyrd commented 5 years ago

I'm seeing other similar errors where the backend assumes a result contains at least one element. In this case it's failing on a request to delete a regional analysis, which seems to be a completely valid request:

16:19:14.780 [qtp1739046130-18] ERROR com.conveyal.taui.AnalysisServer - RUNTIME java.util.NoSuchElementException -> DELETE /api/regional/5ccdb32c32b98e3e23065bde by dominik.sieger@sbb.ch of sbb
16:19:14.780 [qtp1739046130-18] ERROR com.conveyal.taui.AnalysisServer - null
java.util.NoSuchElementException
        at com.mongodb.DBCursor.next(DBCursor.java:169)
        at org.mongojack.DBCursor.next(DBCursor.java:342)
        at com.conveyal.taui.controllers.RegionalAnalysisController.deleteRegionalAnalysis(RegionalAnalysisController.java:78)
        at spark.ResponseTransformerRouteImpl$1.handle(ResponseTransformerRouteImpl.java:47)
        at spark.http.matching.Routes.execute(Routes.java:61)
        at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:130)
        at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1568)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:564)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
        at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:128)
        at org.eclipse.jetty.util.thread.Invocable$InvocableExecutor.invoke(Invocable.java:222)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:294)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:199)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
        at java.lang.Thread.run(Thread.java:748)

trevorgerhardt commented 5 years ago

Similarly, should we prevent users from deleting resources related to running Regional Analyses?

abyrd commented 5 years ago

There are actually two distinct problems here. The first is that we allow objects to be deleted while other objects might still be using or referencing them. Ideally, yes, we'd prevent anything from being deleted while something else still references it. But the simplest line of defense is to check that database results are non-empty (in Java this can be as simple as iterating over them: if the result is empty, the loop body is never executed). In many cases this will leave some field empty; alternatively, we could throw an error saying that the referenced resource appears to have been deleted.
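The two options described above (leave the field empty, or throw a descriptive error) might look roughly like this. It is a sketch only: EmptyResultSketch, nameOrNull, and nameOrThrow are hypothetical names, and any Iterable stands in for the database result.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch; the class, methods, and message are illustrative assumptions.
class EmptyResultSketch {

    /** Option 1: iterate. With no results the loop body never runs and the value stays null. */
    static String nameOrNull(Iterable<String> results) {
        String name = null;
        for (String result : results) {
            name = result;
            break; // only the first match is needed
        }
        return name;
    }

    /** Option 2: fail with a message that names the missing resource. */
    static String nameOrThrow(Iterable<String> results, String id) {
        String name = nameOrNull(results);
        if (name == null) {
            throw new NoSuchElementException(
                    "Referenced resource " + id + " appears to have been deleted.");
        }
        return name;
    }
}
```

Option 1 suits read-only reporting where a blank field is acceptable; option 2 gives the caller a message that points at the missing resource instead of a bare exception.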

The second problem above was a database query that could never succeed - I think it was a simple copy-paste problem. The query was looking for a region API URL parameter that just didn't exist.

trevorgerhardt commented 5 years ago

We haven't seen this problem in a while; commit seems to have resolved it.