VEuPathDB / service-eda

Repo containing EDA web service
Apache License 2.0
0 stars 0 forks source link

eda performance: keep track of cases #22

Open aurreco-uga opened 4 months ago

aurreco-uga commented 4 months ago
  1. Omar shares a UD "piggy test 3" with user 376, the analysis https://clinepidb.org/analysis/ltBJrXq , the plot in it hits a proxy timeout.
  2. older from sam: https://vectorbase.org/vectorbase/app/workspace/maps/kW9ZHD3/import
  3. mbiome user User #1063298523 - sweet@usna.edu 20MB upload 1063298523/5b8MuCFDJ2y she cant see the plot of a diff abundance compute
aurreco-uga commented 3 months ago

20MB biom upload, 26 sources, 32MB attrgraph table, 260MB attrvalues table.

Screenshot 2024-05-13 at 7 57 09 AM

.

IMPORT
script completed successfully for dataset 88KWSr3d3om in 543.049 seconds
9 mn
===================
click to EXPLORE (going to eda)
seconds
slow SQLs (142086 rows) attributegraph_z08qjy_1_Sour
==================
CLICK new visualization
long wait (seconds getting filter counts) to get apps
collapses browser
===================
choose diff abund boxplot
COMPUTE 20mn

2024-05-11 13:20:15.121 [rid:     ][jid:6041e03f9602ca46dd7f482758d3f83e][p:alphadiv] DEBUG PluginExecutor:186 - validating plugin StreamSpecs
**??????????????**
2024-05-11 13:29:59.371 [rid:     ][jid:6041e03f9602ca46dd7f482758d3f83e][p:alphadiv] DEBUG PluginExecutor:155 - retrieving tabular study data: alpha_div_input
2024-05-11 13:29:59.372 [rid:     ][jid:6041e03f9602ca46dd7f482758d3f83e][p:alphadiv] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/merging-internal/query
2024-05-11 13:29:59.638 [rid:2VWri][jid:][p:] INFO  ClientUtil:70 - Will send following GET request to http://localhost:80/studies/88KWSr3d3om
**????????????????**
2024-05-11 13:40:14.617 [rid:2VWri][jid:][p:] DEBUG DependencyResolver:219 - Ordered nodes: [  ]
2024-05-11 13:40:14.618 [rid:2VWri][jid:][p:] DEBUG DependencyResolver:220 - Root nodes remaining: [  ]
2024-05-11 13:40:14.618 [rid:2VWri][jid:][p:] DEBUG RequestResources:55 - Will validate and incorporate derived vars in the following order: 
=====================
CHOOSE VAR

2024-05-11 13:49:03.422 [rid:1DGA8][jid:][p:] INFO  AbstractPlugin:215 - **Request Time: 0ms,** Starting timer
2024-05-11 13:49:09.231 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/computes/alphadiv?autostart=false
{"studyId":"88KWSr3d3om","filters":[],"derivedVariables":[],"config":{"collectionVariable":{"entityId":"MBIOTEMP_Source","collectionId":"EUPATH_0009252"},"alphaDivMethod":"shannon"}}
**?????????????**.   **PROXY ERROR 2mn**
2024-05-11 14:00:56.469 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/computes/alphadiv/meta
2024-05-11 14:00:57.437 [rid:1DGA8][jid:][p:] INFO  AbstractPlugin:215 - **Request Time: 714016ms,** Initial request processing complete
2024-05-11 14:00:57.437 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/merging-internal/query
2024-05-11 14:00:57.445 [rid:2QHhD][jid:][p:] DEBUG PrometheusFilter:92 - Request start: POST /merging-internal/query
2024-05-11 14:00:57.764 [rid:2QHhD][jid:][p:] INFO  ClientUtil:70 - Will send following GET request to http://localhost:80/studies/88KWSr3d3om
**?????????????** MERGING
2024-05-11 14:11:13.142 [rid:2QHhD][jid:][p:] DEBUG DependencyResolver:219 - Ordered nodes: [  ]
2024-05-11 14:11:13.399 [rid:5DG1v][jid:][p:] INFO  JobIDs:53 - Created job ID '6041e03f9602ca46dd7f482758d3f83e'
2024-05-11 14:11:19.150 [rid:1DGA8][jid:][p:] INFO  AbstractPlugin:215 - **Request Time: 1335729ms,** Making requests for data streams
2024-05-11 14:11:19.150 [rid:1DGA8][jid:][p:] INFO  AbstractPlugin:201 - Building and processing 1 required data streams.
2024-05-11 14:11:19.151 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/computes/alphadiv/meta
2024-05-11 14:11:20.045 [rid:1DGA8][jid:][p:] INFO  RServeClient:29 - Connecting to RServe at http://rserve-internal:6311
2024-05-11 14:11:20.068 [rid:1DGA8][jid:][p:] INFO  RServeClient:31 - Connection established
2024-05-11 14:11:20.069 [rid:1DGA8][jid:][p:] INFO  RServeClient:54 - Transferring data stream 'single_tabular_dataset' to RServe
2024-05-11 14:11:20.069 [rid:1DGA8][jid:][p:] INFO  RServeClient:72 - All data streams transferred.
2024-05-11 14:11:20.644 [rid:1DGA8][jid:][p:] INFO  AbstractPlugin:215 - **Request Time: 1337223ms,** Data streams processed; response written; request complete
2024-05-11 14:11:20.645 [rid:1DGA8][jid:][p:] DEBUG PrometheusFilter:165 - Request end: POST /apps/alphadiv/visualizations/boxplot 200
dmgaldi commented 3 months ago

This gap looks like it's time spent constructing a ReferenceMetadata object in merging service:

2024-05-11 13:29:59.638 [rid:2VWri][jid:][p:] INFO  ClientUtil:70 - Will send following GET request to http://localhost:80/studies/88KWSr3d3om
**????????????????**
2024-05-11 13:40:14.617 [rid:2VWri][jid:][p:] DEBUG DependencyResolver:219 - Ordered nodes: [  ]

This would make sense with our theory of the study metadata being too large. I'll check to see if that code is doing something particularly inefficient

aurreco-uga commented 3 months ago

great dan but a reminder that the gaps that are killing the UX are the ones after 13:49:09.231

dmgaldi commented 3 months ago

The good news is I think they're all explained by the same thing. All of the 11 minute gaps are in places where we're constructing the ReferenceMetadata as far as I can tell:

i.e. the gap between

2024-05-11 14:00:57.764 [rid:2QHhD][jid:][p:] INFO  ClientUtil:70 - Will send following GET request to http://localhost:80/studies/88KWSr3d3om
**?????????????** MERGING
2024-05-11 14:11:13.142 [rid:2QHhD][jid:][p:] DEBUG DependencyResolver:219 - Ordered nodes: [  ]
dmgaldi commented 3 months ago

This one, I do need to double check though:

2024-05-11 13:49:09.231 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/computes/alphadiv?autostart=false
{"studyId":"88KWSr3d3om","filters":[],"derivedVariables":[],"config":{"collectionVariable":{"entityId":"MBIOTEMP_Source","collectionId":"EUPATH_0009252"},"alphaDivMethod":"shannon"}}
**?????????????**.   **PROXY ERROR 2mn**
2024-05-11 14:00:56.469 [rid:1DGA8][jid:][p:] INFO  ClientUtil:77 - Will send following POST request to http://localhost:80/computes/alphadiv/meta
dmgaldi commented 3 months ago

Yep, looks like we do the same thing in compute

      var meta = new ReferenceMetadata(
          EDA.getAPIStudyDetail(studyId, auth)
              .orElseThrow(() -> new BadRequestException("Invalid study ID: " + studyId)));
dmgaldi commented 3 months ago

PR to address the ReferenceMetadata latency: https://github.com/VEuPathDB/service-eda/pull/38