LCOGT / mop

Microlensing Observation Portal
GNU General Public License v3.0
0 stars 7 forks source link

Optimize the TargetDetail page #140

Closed rachel3834 closed 5 months ago

rachel3834 commented 5 months ago

The TargetDetail page in MOP for a single object loads slowly. This thread is to investigate optimizations to the view, DB queries etc to improve the UI performance.

As a baseline, here are the results of checkpoints throughout the view and templatetags:

CHECKPOINT: N DB connections: 8, memory: 247.7MiB
TARGETDETAIL: chk 1, time taken 0:00:00.011883
CHECKPOINT: N DB connections: 12, memory: 247.95MiB
TARGETDETAIL: chk 2, time taken 0:00:00.000398
CHECKPOINT: N DB connections: 12, memory: 247.95MiB
TARGETDETAIL: chk 3, time taken 0:00:00.000274
CHECKPOINT: N DB connections: 12, memory: 247.95MiB
TARGETDETAIL: get took 0:00:00.012555
Target post save hook: Gaia23aiy created: False
CHECKPOINT: N DB connections: 318, memory: 275.3MiB
MOP PHOTOMETRY: Got 205 datasets for target Gaia23aiy in 0:00:00.002228
CHECKPOINT: N DB connections: 320, memory: 275.3MiB
MOP PHOTOMETRY: Extracted photometry in 0:00:00.082673
CHECKPOINT: N DB connections: 322, memory: 275.7MiB
MOP PHOTOMETRY: Generated plot in 0:00:00.320835
CHECKPOINT: N DB connections: 324, memory: 299.21MiB
MOP PHOTOMETRY took 0:00:00.405736
MOP INTERFEROMETRY: 2024-02-21 22:23:40.724700
CHECKPOINT: N DB connections: 326, memory: 351.74MiB
MOP INTERFEROMETRY chk 2: 0:00:00.529532
CHECKPOINT: N DB connections: 462, memory: 351.82MiB
MOP INTERFEROMETRY chk 3: 0:00:00.008346
CHECKPOINT: N DB connections: 464, memory: 351.82MiB
MOP INTERFEROMETRY chk 4: 0:00:00.007866
CHECKPOINT: N DB connections: 466, memory: 351.82MiB
MOP INTERFEROMETRY chk 5: 0:00:00.006831
CHECKPOINT: N DB connections: 468, memory: 351.82MiB
MOP INTERFEROMETRY took 0:00:00.552575
rachel3834 commented 5 months ago

As in runTAP and fit_need_events_PSPL, extracting the target extra parameters multiplies the number of querysets from the DB.
For example, mop_extras.mulens_target_data does:

t1 = datetime.utcnow()
utilities.checkpoint()
extras = {k['name']: target.extra_fields.get(k['name'], '') for k in settings.EXTRA_FIELDS if not k.get('hidden')}
utilities.checkpoint()
t2 = datetime.utcnow()
logger.info('MULENS TARGET DATA time taken: ' + str(t2 - t1))

Reports:

CHECKPOINT: N DB connections: 25, memory: 308.81MiB
CHECKPOINT: N DB connections: 183, memory: 309.72MiB
MULENS TARGET DATA time taken: 0:00:00.815782
rachel3834 commented 5 months ago

Replacing this code with:

t1 = datetime.utcnow()
utilities.checkpoint()
#extras = {k['name']: target.extra_fields.get(k['name'], '') for k in settings.EXTRA_FIELDS if not k.get('hidden')}
target_data = {
    'target': mulens.target,
    'extras': mulens.extras,
    'request': request
}
utilities.checkpoint()
t2 = datetime.utcnow()
logger.info('MULENS TARGET DATA time taken: ' + str(t2 - t1))

Gives:

CHECKPOINT: N DB connections: 25, memory: 310.28MiB
CHECKPOINT: N DB connections: 25, memory: 310.28MiB
MULENS TARGET DATA time taken: 0:00:00.001046
rachel3834 commented 5 months ago

Using querytools's functions to efficiently extract the data for a target as an annotated MicrolensingEvent object, I've now updated the mop_extras.mop_photometry function to exploit the data held in the MicrolensingEvent, rather than repeating the DB queries for the ReducedDatums.

The checkpoints from this section of the page now report (compare with the results at the top of this thread for the same page load):

CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY: plot chk 1 0:00:00.002238
CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY: plot chk 2 0:00:00.001208
CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY: plot chk 3 0:00:00.005984
CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY: plot chk 4 0:00:00.019717
CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY: plot chk 5 0:00:00.004480
MOP PHOTOMETRY: Generated plot in 0:00:00.033627
CHECKPOINT: N DB connections: 153, memory: 567.71MiB
MOP PHOTOMETRY took 0:00:00.034545
rachel3834 commented 5 months ago

Interestingly, the time spent in this function is dominated by the generation of the ploty.go figure. Adding checkpoints between the different stages of the generation of this plot gives:

CHECKPOINT: N DB connections: 153, memory: 314.66MiB
MOP PHOTOMETRY: plot chk 1 0:00:00.029995  # Assigning data arrays to go.Scatter objects
CHECKPOINT: N DB connections: 153, memory: 315.29MiB
MOP PHOTOMETRY: plot chk 2 0:00:00.019054  # Creating the plotly layout
CHECKPOINT: N DB connections: 153, memory: 315.65MiB
MOP PHOTOMETRY: plot chk 3 0:00:00.041642  # Adding the vertical line for the current time
CHECKPOINT: N DB connections: 153, memory: 316.05MiB
MOP PHOTOMETRY: plot chk 4 0:00:00.033222  # Adding the model trace
CHECKPOINT: N DB connections: 153, memory: 316.71MiB
MOP PHOTOMETRY: plot chk 5 0:00:00.021283  # Adding plot annotations and axis labels
MOP PHOTOMETRY: Generated plot in 0:00:00.145196

No obvious ways to speed up this plot generation.

rachel3834 commented 5 months ago

The next major time sink in this view is the mop_extras.interferometry_data function. Implementing the MicrolensingEvent model throughout this function (replacing duplicated DB queries) produces the following revised checkpoint results (compare with the output at the top of this thread):

MOP INTERFEROMETRY: 2024-02-22 00:22:39.602745
CHECKPOINT: N DB connections: 155, memory: 383.81MiB
MOP INTERFEROMETRY chk 2: 0:00:00.002133
CHECKPOINT: N DB connections: 155, memory: 383.97MiB
MOP INTERFEROMETRY chk 3: 0:00:00.000692
CHECKPOINT: N DB connections: 155, memory: 383.97MiB
MOP INTERFEROMETRY chk 4: 0:00:00.000517
CHECKPOINT: N DB connections: 155, memory: 383.97MiB
MOP INTERFEROMETRY chk 5: 0:00:00.000838
CHECKPOINT: N DB connections: 155, memory: 383.97MiB
MOP INTERFEROMETRY took 0:00:00.004180
rachel3834 commented 5 months ago

The interferometry function is followed by a call to the function which produces the Gaia table. Initial checkpoint baseline from unchanged function:

CHECKPOINT: N DB connections: 155, memory: 386.86MiB
GAIA TABLE took 0:00:00.006882
CHECKPOINT: N DB connections: 157, memory: 386.86MiB
rachel3834 commented 5 months ago

Following updates to use the MicrolensingEvent object with pre-loaded data:

CHECKPOINT: N DB connections: 155, memory: 383.29MiB
GAIA TABLE took 0:00:00.000572
CHECKPOINT: N DB connections: 155, memory: 383.29MiB
rachel3834 commented 5 months ago

End-to-end load time for the whole targetpage is now 3.2s, reduced from 4.3s.

rachel3834 commented 5 months ago

Looking into where the time to load a single target page goes, this is the output of checkpointing the various elements of the TargetDetail page template:

STARTING TargetDetail page load: datetime.datetime(2024, 2, 22, 19, 0, 52, 962636)
CHECKPOINT: N DB connections: 7, memory: 247.73MiB
TARGETDETAIL: chk 1, time taken 0:00:00.003411
CHECKPOINT: N DB connections: 9, memory: 247.73MiB
TARGETDETAIL: chk 2, time taken 0:00:00.000336
CHECKPOINT: N DB connections: 9, memory: 247.73MiB
TARGETDETAIL: chk 3, time taken 0:00:00.000275
CHECKPOINT: N DB connections: 9, memory: 247.73MiB
TARGETDETAIL: get took 0:00:00.004022
queryTools: Retrieved associated data for 1 Targets
CHECKPOINT: N DB connections: 15, memory: 247.91MiB
queryTools: Time taken: 0:00:00.000648
queryTools: collating data on microlensing event set
queryTools: collated data for target Gaia23aiy, 0 out of 1
CHECKPOINT: N DB connections: 19, memory: 295.34MiB
queryTools: Collated data for 1 targets in 0:00:00.605010
CHECKPOINT: N DB connections: 19, memory: 295.34MiB
CHECKPOINT: N DB connections: 23, memory: 308.89MiB
CHECKPOINT: N DB connections: 23, memory: 308.89MiB
MULENS TARGET DATA time taken: 0:00:00.002617
CLASS FORM started at 2024-02-22 19:00:54.188471
CHECKPOINT: N DB connections: 63, memory: 309.29MiB
CLASS FORM took 0:00:00.031639
CHECKPOINT: N DB connections: 71, memory: 309.32MiB
MOP PHOTOMETRY: Started at 2024-02-22 19:00:54.780068, got 201 datasets for target Gaia23aiy
CHECKPOINT: N DB connections: 151, memory: 314.63MiB
MOP PHOTOMETRY took 0:00:00.133894
CHECKPOINT: N DB connections: 151, memory: 317.02MiB
PYLIMA MODEL EXTRACT started 2024-02-22 19:00:55.916524
PyLIMA model load took 0:00:00.001111
CHECKPOINT: N DB connections: 153, memory: 385.19MiB
MOP INTERFEROMETRY started at: 2024-02-22 19:00:55.925633
CHECKPOINT: N DB connections: 155, memory: 385.2MiB
CHECKPOINT: N DB connections: 155, memory: 385.2MiB
MOP INTERFEROMETRY took 0:00:00.004165
GAIA NEIGHBOURS started 2024-02-22 19:00:55.944038
CHECKPOINT: N DB connections: 155, memory: 385.23MiB
GAIA NEIGHBOURS took 0:00:00.000685
CHECKPOINT: N DB connections: 155, memory: 385.23MiB
END Targetpage get 2024-02-22 19:00:55.944723

So the whole process in this example took 2.98s. This is broken down as:

Further tests indicate that fetching the context data for the page takes 0.65s, which is dominated by the fetch of the target data. However, from the start of the page load to the start of the template displaying the microlensing parameter data (the first checkpoint in the page template) took 1.15s. The major element unaccounted for between these checkpoints is a TOM templatetag that contacts the LCO Observe Portal to get the list of upcoming observations.

According to the timestamps, the template partial that displays the microlensing parameters (extra parameters) takes about 0.15s to render. This is currently being parsed both a target and the extracted target extra fields, but its calling the extra parameters from the target object rather than the data its being parsed.

Generating the custom photometry plot takes 0.3s, but the time gap between starting that function and starting the next templatetage (the PyLIMA model extraction) took 1.08s. The only action in the template in between is to load spectroscopy for the target, of which there isn't any, but an empty plotly plot is generated. This is a built-in templatetag from the Toolkit.

Plotting the lightcurve does cause a striking increase in the number of DB querysets, 71 to 151.

rachel3834 commented 5 months ago

I also investigated converting the classification form and template to use the Microlensing Event object but this is difficult because the associated form is built around a Target object, which will not validate if a Microlensing Event object is passed instead. Since the timedelta for this step isn't great, I'm holding off changing this for now.

The next most impactful change I can make would be to introduce asynchronous queries to the LCO portal etc.
Some of these queries (e.g. for pending observations) can be quite substantial for a high volume key project. But that's beyond the scope for this issue.