AllenInstitute / datacube


bad request for datacube can cause human mtg service restart #113

Closed shus2018 closed 5 years ago

shus2018 commented 5 years ago

We saw the human MTG service auto-restart today. From the log, it looks like requests are trying to reach the disabled conn services. In general, we should handle bad requests better: a bad request to datacube should not cause datacube services to restart, as happened with the human MTG service today.
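As a rough illustration of handling such a request more gracefully, the sketch below turns a "no callee registered" failure into an HTTP error status instead of letting it propagate. It is a synchronous simplification under stated assumptions: `ApplicationError` here is a stand-in class rather than autobahn's, and `bridge_request` and `call_procedure` are hypothetical names, not the actual code in conn_bridge.py.

```python
class ApplicationError(Exception):
    """Stand-in for autobahn.wamp.exception.ApplicationError (assumption)."""

    def __init__(self, error, *args):
        super().__init__(*args)
        self.error = error


# WAMP error URI reported when no callee is registered for a procedure.
NO_SUCH_PROCEDURE = "wamp.error.no_such_procedure"


def bridge_request(call_procedure, procedure, **kwargs):
    """Call a procedure and return (http_status, body) rather than raising."""
    try:
        return 200, call_procedure(procedure, **kwargs)
    except ApplicationError as exc:
        if exc.error == NO_SUCH_PROCEDURE:
            # The backing service is disabled or not registered.
            return 503, {"error": "service unavailable", "procedure": procedure}
        return 500, {"error": exc.error}


def disabled_service(procedure, **kwargs):
    """Hypothetical caller that mimics a disabled backend."""
    raise ApplicationError(NO_SUCH_PROCEDURE, "no callee registered for procedure")


status, body = bridge_request(disabled_service, "example.disabled.procedure")
print(status, body["error"])
```

With this shape, a request to a disabled service gets a 503 response, and the failure stays confined to that one request.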

Errors from the log:

"Request <Request at 0x7f04ff450b38 method=GET uri=/mouseconn/data/projection_map/target?seedPoint=6600,5400,4800&dataset=159648854 clientproto=HTTP/1.1> failed with [Failure instance: Traceback: <class 'autobahn.wamp.exception.ApplicationError'>: ApplicationError(error=, args=['no callee registered for procedure '], kwargs={}, enc_algo=None)\n--- ---\n/local1/apps/datacube-builds/DataCube--236/services/legacy_conn_routes/conn_bridge.py:104:mouseconn_spatial\n]."
2018-10-16T12:44:44-0700 [Router 21530] "Request <Request at 0x7f04ff3dd208 method=GET uri=/mouseconn/data/projection_map/target?seedPoint=6600,5400,4800&startRow=25&numRows=50 clientproto=HTTP/1.1> failed with [Failure instance: Traceback: <class 'autobahn.wamp.exception.ApplicationError'>: ApplicationError(error=, args=['no callee registered for procedure '], kwargs={}, enc_algo=None)\n--- ---\n/local1/apps/datacube-builds/DataCube--236/services/legacy_conn_routes/conn_bridge.py:104:mouseconn_spatial\n]."
2018-10-16T14:02:27-0700 [Router 21530] session "4326058382500104" left realm "aibs"
2018-10-16T14:02:27-0700 [Guest 21546] /local1/apps/datacube-builds/DataCube--236/services/datacube/run.sh: line 6: 21549 Killed python -u server.py $@
2018-10-16T14:02:27-0700 [Guest 21546] Service crashed with exit code 137. Respawning...
2018-10-16T14:02:43-0700 [Guest 21546] 2018-10-16T14:02:43-0700 deleting '/dev/shm/human_mtg_transcriptomics.zarr.lmdb'...
2018-10-16T14:02:43-0700 [Guest 21546] 2018-10-16T14:02:43-0700 cloning '../.././human_mtg_data/human_mtg_transcriptomics.zarr.lmdb' store to '/dev/shm/human_mtg_transcriptomics.zarr.lmdb'...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 loading '/dev/shm/human_mtg_transcriptomics.zarr.lmdb' zarr LMDBstore as xarray dataset...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building indexes...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building index for field 'age_days'...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building index for field 'age_id'...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building index for field 'brain_hemisphere'...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building index for field 'brain_hemisphere_id'...
2018-10-16T14:02:47-0700 [Guest 21546] 2018-10-16T14:02:47-0700 building index fo
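The "exit code 137" in the log can be decoded with the shell convention that a status above 128 means the process was terminated by signal number (status - 128). A minimal sketch of that arithmetic, where `describe_exit_code` is a hypothetical helper name and the signal numbering assumes Linux:

```python
import signal


def describe_exit_code(code):
    """Describe a shell-style exit status; values above 128 encode a signal."""
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            return f"exited with status {code} (no matching signal)"
        return f"terminated by {sig.name} (signal {sig.value})"
    return f"exited with status {code}"


print(describe_exit_code(137))  # 137 - 128 = 9
```

On Linux, signal 9 is SIGKILL. That is what the kernel's OOM killer sends, although the exit code alone does not distinguish an OOM kill from a manual kill -9.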

chrisbarber commented 5 years ago

I recently fixed a problem where a particular type of request was unnecessarily allocating around 4 GB of RAM and was also very slow (fixed in https://github.com/AllenInstitute/datacube/commit/6d534da5e8414031a999e0b2c15b4c70b00208fc). An example request of this form is:

curl -H "Content-Type:application/json" -d '{"procedure": "org.brain-map.api.datacube.groupby.human_mtg_transcriptomics", "args": [], "kwargs": {"field":"nucleus", "groupby":["cluster", "brain_subregion"], "agg_func":"size", "sort":["cluster", "brain_subregion"], "ascending":[true,true], "filters":[{"field": "brain_hemisphere", "op": "!=", "value": "R"}]}}' http://tdatacube:8080/call

I'm not sure whether the web apps team is making these types of requests, but if more than one came in at the same time, it could cause the human MTG datacube to get OOM-killed. That's what this looks like to me (see the log line that says /local1/apps/datacube-builds/DataCube--236/services/datacube/run.sh: line 6: 21549 Killed python -u server.py $@).
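For anyone who wants to reproduce the groupby request from Python rather than curl, the sketch below builds the same JSON body with the standard library. The procedure name, kwargs, and URL are copied from the curl example; `build_groupby_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# URL taken from the curl example in this thread.
CALL_URL = "http://tdatacube:8080/call"


def build_groupby_request(url=CALL_URL):
    """Build (but do not send) the POST request matching the curl example."""
    payload = {
        "procedure": "org.brain-map.api.datacube.groupby.human_mtg_transcriptomics",
        "args": [],
        "kwargs": {
            "field": "nucleus",
            "groupby": ["cluster", "brain_subregion"],
            "agg_func": "size",
            "sort": ["cluster", "brain_subregion"],
            "ascending": [True, True],
            "filters": [
                {"field": "brain_hemisphere", "op": "!=", "value": "R"},
            ],
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


req = build_groupby_request()
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually issue the call. On a build without the fix, each such request may allocate several GB, so sending several at once against a shared instance is best avoided.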

I merged this commit to the release branch yesterday. I haven't heard any complaints about this particular issue from the web apps team, so you probably don't need to redeploy unless you want to, or unless we hear otherwise.

shus2018 commented 5 years ago

Thanks, Chris.